Walter Daelemans

Also published as: W. Daelemans


2018

Enhancing General Sentiment Lexicons for Domain-Specific Use
Tim Kreutz | Walter Daelemans
Proceedings of the 27th International Conference on Computational Linguistics

Lexicon-based methods for sentiment analysis rely on high-quality polarity lexicons. In recent years, automatic methods for inducing lexicons have increased the viability of lexicon-based methods for polarity classification. SentProp is a framework for inducing domain-specific polarities from word embeddings. We elaborate on SentProp by evaluating its use for enhancing DuOMan, a general-purpose lexicon, for use in the political domain. By adding only the top sentiment-bearing words from the vocabulary and applying small polarity shifts in the general-purpose lexicon, we increase accuracy in an in-domain classification task. The enhanced lexicon performs worse than the original lexicon in an out-of-domain task, showing that the words we added and the polarity shifts we applied are domain-specific and do not translate well to an out-of-domain setting.

Exploring Classifier Combinations for Language Variety Identification
Tim Kreutz | Walter Daelemans
Proceedings of the Fifth Workshop on NLP for Similar Languages, Varieties and Dialects (VarDial 2018)

This paper describes CLiPS’s submissions for the Discriminating between Dutch and Flemish in Subtitles (DFS) shared task at VarDial 2018. We explore different ways to combine classifiers trained on different feature groups. Our best system uses two linear SVM classifiers: one trained on lexical features (word n-grams) and one trained on syntactic features (PoS n-grams). The final prediction of whether a document is in Flemish Dutch or Netherlandic Dutch is made by the classifier that outputs the highest probability for one of the two labels. This confidence vote approach outperforms a meta-classifier on the development data and on the test data.
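
The confidence vote described in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `confidence_vote`, the label names `BEL` and `DUT`, and the probability values are all hypothetical, and the classifiers are assumed to have already produced per-label probabilities.

```python
def confidence_vote(prob_sets):
    """Return the label proposed by the most confident classifier.

    prob_sets: list of dicts mapping label -> probability,
    one dict per classifier (hypothetical input format).
    """
    best_label, best_prob = None, -1.0
    for probs in prob_sets:
        # Each classifier proposes its own most probable label.
        label, prob = max(probs.items(), key=lambda item: item[1])
        # Keep the proposal that was made with the highest probability.
        if prob > best_prob:
            best_label, best_prob = label, prob
    return best_label


# Made-up outputs of a lexical and a syntactic classifier for one document.
lexical = {"BEL": 0.62, "DUT": 0.38}
syntactic = {"BEL": 0.15, "DUT": 0.85}
print(confidence_vote([lexical, syntactic]))  # DUT
```

In this toy case the syntactic classifier is more confident (0.85 against 0.62), so its label wins even though the lexical classifier disagrees.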

Rule induction for global explanation of trained models
Madhumita Sushil | Simon Šuster | Walter Daelemans
Proceedings of the 2018 EMNLP Workshop BlackboxNLP: Analyzing and Interpreting Neural Networks for NLP

Understanding the behavior of a trained network and finding explanations for its outputs is important for improving the network’s performance and generalization ability, and for ensuring trust in automated systems. Several approaches have previously been proposed to identify and visualize the most important features by analyzing a trained network. However, the relations between different features and classes are lost in most cases. We propose a technique to induce sets of if-then-else rules that capture these relations to globally explain the predictions of a network. We first calculate the importance of the features in the trained network. We then weigh the original inputs with these feature importance scores, simplify the transformed input space, and finally fit a rule induction model to explain the model predictions. We find that the output rule-sets can explain the predictions of a neural network trained for 4-class text classification from the 20 newsgroups dataset to a macro-averaged F-score of 0.80. We make the code available at https://github.com/clips/interpret_with_rules.

Revisiting neural relation classification in clinical notes with external information
Simon Šuster | Madhumita Sushil | Walter Daelemans
Proceedings of the Ninth International Workshop on Health Text Mining and Information Analysis

Recently, segment convolutional neural networks have been proposed for end-to-end relation extraction in the clinical domain, achieving results comparable to or outperforming the approaches with heavy manual feature engineering. In this paper, we analyze the errors made by the neural classifier based on confusion matrices, and then investigate three simple extensions to overcome its limitations. We find that including ontological association between drugs and problems, and data-induced association between medical concepts does not reliably improve the performance, but that large gains are obtained by the incorporation of semantic classes to capture relation triggers.

Predicting Adolescents’ Educational Track from Chat Messages on Dutch Social Media
Lisa Hilte | Walter Daelemans | Reinhild Vandekerckhove
Proceedings of the 9th Workshop on Computational Approaches to Subjectivity, Sentiment and Social Media Analysis

We aim to predict Flemish adolescents’ educational track based on their Dutch social media writing. We distinguish between the three main types of Belgian secondary education: General (theory-oriented), Vocational (practice-oriented), and Technical Secondary Education (hybrid). The best results are obtained with a Naive Bayes model, i.e. an F-score of 0.68 (std. dev. 0.05) in 10-fold cross-validation experiments on the training data and an F-score of 0.60 on unseen data. Many of the most informative features are character n-grams containing specific occurrences of chatspeak phenomena such as emoticons. While the detection of the most theory- and practice-oriented educational tracks seems to be a relatively easy task, the hybrid Technical level appears to be much harder to capture based on online writing style, as expected.

WordKit: a Python Package for Orthographic and Phonological Featurization
Stéphan Tulkens | Dominiek Sandra | Walter Daelemans
Proceedings of the Eleventh International Conference on Language Resources and Evaluation (LREC 2018)

From Strings to Other Things: Linking the Neighborhood and Transposition Effects in Word Reading
Stéphan Tulkens | Dominiek Sandra | Walter Daelemans
Proceedings of the 22nd Conference on Computational Natural Language Learning

We investigate the relation between the transposition and deletion effects in word reading, i.e., the finding that readers can successfully read “SLAT” as “SALT”, or “WRK” as “WORK”, and the neighborhood effect. In particular, we investigate whether lexical orthographic neighborhoods take into account transposition and deletion in determining neighbors. If this is the case, it is more likely that the neighborhood effect takes place early during processing, and does not solely rely on similarity of internal representations. We introduce a new neighborhood measure, rd20, which can be used to quantify neighborhood effects over arbitrary feature spaces. We calculate the rd20 over large sets of words in three languages using various feature sets and show that feature sets that do not allow for transposition or deletion explain more variance in Reaction Time (RT) measurements. We also show that the rd20 can be calculated using the hidden state representations of a Multi-Layer Perceptron, and show that these explain less variance than the raw features. We conclude that the neighborhood effect is unlikely to have a perceptual basis, but is more likely to be the result of items co-activating after recognition. All code is available at: www.github.com/clips/conll2018

CliCR: a Dataset of Clinical Case Reports for Machine Reading Comprehension
Simon Šuster | Walter Daelemans
Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long Papers)

We present a new dataset for machine comprehension in the medical domain. Our dataset uses clinical case reports with around 100,000 gap-filling queries about these cases. We apply several baselines and state-of-the-art neural readers to the dataset, and observe a considerable gap in performance (20% F1) between the best human and machine readers. We analyze the skills required for successful answering and show how reader performance varies depending on the applicable skills. We find that inferences using domain knowledge and object tracking are the most frequently required skills, and that recognizing omitted information and spatio-temporal reasoning are the most difficult for the machines.

2017

A Short Review of Ethical Challenges in Clinical Natural Language Processing
Simon Šuster | Stéphan Tulkens | Walter Daelemans
Proceedings of the First ACL Workshop on Ethics in Natural Language Processing

Clinical NLP has immense potential to contribute to how clinical practice will be revolutionized by the advent of large-scale processing of clinical records. However, this potential has remained largely untapped due to slow progress primarily caused by strict data access policies for researchers. In this paper, we discuss the concern for privacy and the measures it entails. We also suggest sources of less sensitive data. Finally, we draw attention to biases that can compromise the validity of empirical research and lead to socially harmful applications.

Unsupervised Context-Sensitive Spelling Correction of Clinical Free-Text with Word and Character N-Gram Embeddings
Pieter Fivez | Simon Šuster | Walter Daelemans
BioNLP 2017

We present an unsupervised context-sensitive spelling correction method for clinical free-text that uses word and character n-gram embeddings. Our method generates misspelling replacement candidates and ranks them according to their semantic fit, by calculating a weighted cosine similarity between the vectorized representation of a candidate and the misspelling context. We greatly outperform two baseline off-the-shelf spelling correction tools on a manually annotated MIMIC-III test set, and counter the frequency bias of an optimized noisy channel model, showing that neural embeddings can be successfully exploited to include context-awareness in a spelling correction model.
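
The candidate ranking step in this abstract can be sketched roughly as follows. This is an illustration under stated assumptions, not the paper's method: the reciprocal-distance weighting of context words, the function names, the two-dimensional toy embeddings, and the example words are all hypothetical.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def context_vector(context, embeddings):
    """Weighted sum of context word vectors.

    context: list of (word, distance) pairs, distance >= 1 counted from
    the misspelling. Assumption: closer words weigh more (1 / distance).
    """
    dim = len(next(iter(embeddings.values())))
    total = [0.0] * dim
    for word, distance in context:
        vec = embeddings.get(word)
        if vec is None:
            continue  # skip out-of-vocabulary context words
        weight = 1.0 / distance
        total = [t + weight * x for t, x in zip(total, vec)]
    return total


def rank_candidates(candidates, context, embeddings):
    """Order replacement candidates by similarity to the weighted context."""
    ctx = context_vector(context, embeddings)
    scored = [(cosine(embeddings[c], ctx), c) for c in candidates if c in embeddings]
    return [c for _, c in sorted(scored, reverse=True)]


# Toy embeddings (made up) for the misspelling "fevr" in "has high fevr".
embeddings = {
    "fever": [0.9, 0.2],
    "favor": [0.1, 1.0],
    "has": [0.5, 0.5],
    "high": [0.8, 0.3],
}
context = [("has", 2), ("high", 1)]
print(rank_candidates(["favor", "fever"], context, embeddings))
```

With these toy vectors, "fever" ranks above "favor" because its vector lies closer to the weighted context vector.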

Simple Queries as Distant Labels for Predicting Gender on Twitter
Chris Emmery | Grzegorz Chrupała | Walter Daelemans
Proceedings of the 3rd Workshop on Noisy User-generated Text

The majority of research on extracting missing user attributes from social media profiles uses costly hand-annotated labels for supervised learning. Distantly supervised methods exist, although these generally rely on knowledge gathered using external sources. This paper demonstrates the effectiveness of gathering distant labels for self-reported gender on Twitter using simple queries. We confirm the reliability of this query heuristic by comparing with manual annotation. Moreover, using these labels for distant supervision, we demonstrate competitive model performance on the same data as models trained on manual annotations. As such, we offer a cheap, extensible, and fast alternative that can be employed beyond the task of gender classification.
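
As a rough illustration of how a simple self-report query could yield distant labels, consider the sketch below. The query pattern, the word lists, and the label names `F` and `M` are assumptions made for this example; the abstract does not specify the exact queries used.

```python
import re

# Hypothetical self-report keywords; the real query set may differ.
FEMALE = ("girl", "woman", "female")
MALE = ("boy", "man", "male", "guy")

# Matches phrases such as "I'm a girl" or "I am a man".
PATTERN = re.compile(
    r"\bI(?:'m| am) an? (%s)\b" % "|".join(FEMALE + MALE),
    re.IGNORECASE,
)


def distant_label(tweet):
    """Return 'F' or 'M' for a self-reporting tweet, else None."""
    match = PATTERN.search(tweet)
    if match is None:
        return None
    return "F" if match.group(1).lower() in FEMALE else "M"


print(distant_label("I'm a girl and I love football"))  # F
print(distant_label("I'm a manager"))                   # None
```

The trailing word boundary keeps "I'm a manager" from being labeled through the substring "man". Labels gathered this way are noisy, which is why the abstract reports checking the heuristic against manual annotation.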

Assessing the Stylistic Properties of Neurally Generated Text in Authorship Attribution
Enrique Manjavacas | Jeroen De Gussem | Walter Daelemans | Mike Kestemont
Proceedings of the Workshop on Stylistic Variation

Recent applications of neural language models have led to an increased interest in the automatic generation of natural language. However impressive, the evaluation of neurally generated text has so far remained rather informal and anecdotal. Here, we present an attempt at the systematic assessment of one aspect of the quality of neurally generated text. We focus on a specific aspect of neural language generation: its ability to reproduce authorial writing styles. Using established models for authorship attribution, we empirically assess the stylistic qualities of neurally generated text. In comparison to conventional language models, neural models generate fuzzier text that is relatively harder to attribute correctly. Nevertheless, our results also suggest that neurally generated text offers more valuable perspectives for the augmentation of training data.

Towards the Improvement of Automatic Emotion Pre-annotation with Polarity and Subjective Information
Lea Canales | Walter Daelemans | Ester Boldrini | Patricio Martínez-Barco
Proceedings of the International Conference Recent Advances in Natural Language Processing, RANLP 2017

Emotion detection has a high potential positive impact on business, society, politics and education. Given this, the main objective of our research is to contribute to the resolution of one of the most important challenges in textual emotion detection: emotional corpora annotation. This will be tackled by proposing a semi-automatic methodology. It consists of two main phases: (1) an automatic process to pre-annotate the unlabelled sentences with a reduced number of emotional categories; and (2) a manual process of refinement where human annotators will determine which is the dominant emotion among the predefined set. Our objective in this paper is to show the pre-annotation process, as well as to evaluate the usability of subjective and polarity information in this process. The evaluation performed clearly confirms the benefits of employing the polarity and subjective information in emotion detection and thus endorses the relevance of our approach.

2016

Using Distributed Representations to Disambiguate Biomedical and Clinical Concepts
Stéphan Tulkens | Simon Šuster | Walter Daelemans
Proceedings of the 15th Workshop on Biomedical Natural Language Processing

TwiSty: A Multilingual Twitter Stylometry Corpus for Gender and Personality Profiling
Ben Verhoeven | Walter Daelemans | Barbara Plank
Proceedings of the Tenth International Conference on Language Resources and Evaluation (LREC'16)

Personality profiling is the task of detecting personality traits of authors based on writing style. Several personality typologies exist; however, the Myers-Briggs Type Indicator (MBTI) is particularly popular in the non-scientific community, and many people use it to analyse their own personality and talk about the results online. Therefore, large amounts of self-assessed data on MBTI are readily available on social-media platforms such as Twitter. We present a novel corpus of tweets annotated with the MBTI personality type and gender of their author for six Western European languages (Dutch, German, French, Italian, Portuguese and Spanish). We outline the corpus creation and annotation, show statistics of the obtained data distributions and present first baselines on Myers-Briggs personality profiling and gender prediction for all six languages.

Evaluating Unsupervised Dutch Word Embeddings as a Linguistic Resource
Stéphan Tulkens | Chris Emmery | Walter Daelemans
Proceedings of the Tenth International Conference on Language Resources and Evaluation (LREC'16)

Word embeddings have recently seen a strong increase in interest as a result of strong performance gains on a variety of tasks. However, most of this research also underlined the importance of benchmark datasets, and the difficulty of constructing these for a variety of language-specific tasks. Still, many of the datasets used in these tasks could prove to be fruitful linguistic resources, allowing for unique observations into language use and variability. In this paper we demonstrate the performance of multiple types of embeddings, created with both count and prediction-based architectures on a variety of corpora, in two language-specific tasks: relation evaluation, and dialect identification. For the latter, we compare unsupervised methods with a traditional, hand-crafted dictionary. With this research, we provide the embeddings themselves, the relation evaluation task benchmark for use in further research, and demonstrate how the benchmarked embeddings prove a useful unsupervised linguistic resource, effectively used in a downstream task.

2015

Detection and Fine-Grained Classification of Cyberbullying Events
Cynthia Van Hee | Els Lefever | Ben Verhoeven | Julie Mennes | Bart Desmet | Guy De Pauw | Walter Daelemans | Veronique Hoste
Proceedings of the International Conference Recent Advances in Natural Language Processing

Towards a Model of Prediction-based Syntactic Category Acquisition: First Steps with Word Embeddings
Robert Grimm | Giovanni Cassani | Walter Daelemans | Steven Gillis
Proceedings of the Sixth Workshop on Cognitive Aspects of Computational Language Learning

Which distributional cues help the most? Unsupervised contexts selection for lexical category acquisition
Giovanni Cassani | Robert Grimm | Walter Daelemans | Steven Gillis
Proceedings of the Sixth Workshop on Cognitive Aspects of Computational Language Learning

2014

CLiPS Stylometry Investigation (CSI) corpus: A Dutch corpus for the detection of age, gender, personality, sentiment and deception in text
Ben Verhoeven | Walter Daelemans
Proceedings of the Ninth International Conference on Language Resources and Evaluation (LREC'14)

The Strategic Impact of META-NET on the Regional, National and International Level
Georg Rehm | Hans Uszkoreit | Sophia Ananiadou | Núria Bel | Audronė Bielevičienė | Lars Borin | António Branco | Gerhard Budin | Nicoletta Calzolari | Walter Daelemans | Radovan Garabík | Marko Grobelnik | Carmen García-Mateo | Josef van Genabith | Jan Hajič | Inma Hernáez | John Judge | Svetla Koeva | Simon Krek | Cvetana Krstev | Krister Lindén | Bernardo Magnini | Joseph Mariani | John McNaught | Maite Melero | Monica Monachini | Asunción Moreno | Jan Odijk | Maciej Ogrodniczuk | Piotr Pęzik | Stelios Piperidis | Adam Przepiórkowski | Eiríkur Rögnvaldsson | Michael Rosner | Bolette Pedersen | Inguna Skadiņa | Koenraad De Smedt | Marko Tadić | Paul Thompson | Dan Tufiş | Tamás Váradi | Andrejs Vasiļjevs | Kadri Vider | Jolanta Zabarskaite
Proceedings of the Ninth International Conference on Language Resources and Evaluation (LREC'14)

Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing (EMNLP)
Alessandro Moschitti | Bo Pang | Walter Daelemans
Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing (EMNLP)

Proceedings of the First Workshop on Computational Approaches to Compound Analysis (ComAComA 2014)
Ben Verhoeven | Walter Daelemans | Menno van Zaanen | Gerhard van Huyssteen
Proceedings of the First Workshop on Computational Approaches to Compound Analysis (ComAComA 2014)

Automatic Compound Processing: Compound Splitting and Semantic Analysis for Afrikaans and Dutch
Ben Verhoeven | Menno van Zaanen | Walter Daelemans | Gerhard van Huyssteen
Proceedings of the First Workshop on Computational Approaches to Compound Analysis (ComAComA 2014)

2013

A Self Learning Vocal Interface for Speech-impaired Users
Bart Ons | Netsanet Tessema | Janneke van de Loo | Jort Gemmeke | Guy De Pauw | Walter Daelemans | Hugo Van hamme
Proceedings of the Fourth Workshop on Speech and Language Processing for Assistive Technologies

2012

ConanDoyle-neg: Annotation of negation cues and their scope in Conan Doyle stories
Roser Morante | Walter Daelemans
Proceedings of the Eighth International Conference on Language Resources and Evaluation (LREC'12)

“Vreselijk mooi!” (terribly beautiful): A Subjectivity Lexicon for Dutch Adjectives.
Tom De Smedt | Walter Daelemans
Proceedings of the Eighth International Conference on Language Resources and Evaluation (LREC'12)

The Netlog Corpus. A Resource for the Study of Flemish Dutch Internet Language
Mike Kestemont | Claudia Peersman | Benny De Decker | Guy De Pauw | Kim Luyckx | Roser Morante | Frederik Vaassen | Janneke van de Loo | Walter Daelemans
Proceedings of the Eighth International Conference on Language Resources and Evaluation (LREC'12)

Improving Topic Classification for Highly Inflective Languages
Jurgita Kapociute-Dzikiene | Frederik Vaassen | Walter Daelemans | Algis Krupavičius
Proceedings of COLING 2012

Towards a Self-Learning Assistive Vocal Interface: Vocabulary and Grammar Learning
Janneke van de Loo | Jort F. Gemmeke | Guy De Pauw | Joris Driesen | Hugo Van hamme | Walter Daelemans
Proceedings of the 1st Workshop on Speech and Multimodal Interaction in Assistive Environments

Proceedings of the 13th Conference of the European Chapter of the Association for Computational Linguistics
Walter Daelemans
Proceedings of the 13th Conference of the European Chapter of the Association for Computational Linguistics

A Statistical Relational Learning Approach to Identifying Evidence Based Medicine Categories
Mathias Verbeke | Vincent Van Asch | Roser Morante | Paolo Frasconi | Walter Daelemans | Luc De Raedt
Proceedings of the 2012 Joint Conference on Empirical Methods in Natural Language Processing and Computational Natural Language Learning

2011

Corpus-based approaches to processing the scope of negation cues: an evaluation of the state of the art
Roser Morante | Sarah Schrauwen | Walter Daelemans
Proceedings of the Ninth International Conference on Computational Semantics (IWCS 2011)

Automatic Emotion Classification for Interpersonal Communication
Frederik Vaassen | Walter Daelemans
Proceedings of the 2nd Workshop on Computational Approaches to Subjectivity and Sentiment Analysis (WASSA 2.011)

2010

Using Domain Similarity for Performance Estimation
Vincent Van Asch | Walter Daelemans
Proceedings of the 2010 Workshop on Domain Adaptation for Natural Language Processing

Memory-Based Resolution of In-Sentence Scopes of Hedge Cues
Roser Morante | Vincent Van Asch | Walter Daelemans
Proceedings of the Fourteenth Conference on Computational Natural Language Learning – Shared Task

2009

Prepositional Phrase Attachment in Shallow Parsing
Vincent Van Asch | Walter Daelemans
Proceedings of the International Conference RANLP-2009

Prototype-based Active Learning for Lemmatization
Walter Daelemans | Hendrik J. Groenewald | Gerhard B. van Huyssteen
Proceedings of the International Conference RANLP-2009

Is Sentence Compression an NLG task?
Erwin Marsi | Emiel Krahmer | Iris Hendrickx | Walter Daelemans
Proceedings of the 12th European Workshop on Natural Language Generation (ENLG 2009)

A Metalearning Approach to Processing the Scope of Negation
Roser Morante | Walter Daelemans
Proceedings of the Thirteenth Conference on Computational Natural Language Learning (CoNLL-2009)

Learning the Scope of Hedge Cues in Biomedical Texts
Roser Morante | Walter Daelemans
Proceedings of the BioNLP 2009 Workshop

A memory-based learning approach to event extraction in biomedical texts
Roser Morante | Vincent Van Asch | Walter Daelemans
Proceedings of the BioNLP 2009 Workshop Companion Volume for Shared Task

Reducing Redundancy in Multi-document Summarization Using Lexical Semantic Similarity
Iris Hendrickx | Walter Daelemans | Erwin Marsi | Emiel Krahmer
Proceedings of the 2009 Workshop on Language Generation and Summarisation (UCNLG+Sum 2009)

A Robust and Extensible Exemplar-Based Model of Thematic Fit
Bram Vandekerckhove | Dominiek Sandra | Walter Daelemans
Proceedings of the 12th Conference of the European Chapter of the ACL (EACL 2009)

2008

Learning the Scope of Negation in Biomedical Texts
Roser Morante | Anthony Liekens | Walter Daelemans
Proceedings of the 2008 Conference on Empirical Methods in Natural Language Processing

Personae: a Corpus for Author and Personality Prediction from Text
Kim Luyckx | Walter Daelemans
Proceedings of the Sixth International Conference on Language Resources and Evaluation (LREC'08)

A Coreference Corpus and Resolution System for Dutch
Iris Hendrickx | Gosse Bouma | Frederik Coppens | Walter Daelemans | Veronique Hoste | Geert Kloosterman | Anne-Marie Mineur | Joeri Van Der Vloet | Jean-Luc Verschelde
Proceedings of the Sixth International Conference on Language Resources and Evaluation (LREC'08)

Authorship Attribution and Verification with Many Authors and Limited Data
Kim Luyckx | Walter Daelemans
Proceedings of the 22nd International Conference on Computational Linguistics (Coling 2008)

CNTS: Memory-Based Learning of Generating Repeated References
Iris Hendrickx | Walter Daelemans | Kim Luyckx | Roser Morante | Vincent Van Asch
Proceedings of the Fifth International Natural Language Generation Conference

A Combined Memory-Based Semantic Role Labeler of English
Roser Morante | Walter Daelemans | Vincent Van Asch
CoNLL 2008: Proceedings of the Twelfth Conference on Computational Natural Language Learning

2007

Letter to the Editor
Walter Daelemans | Antal van den Bosch
Computational Linguistics, Volume 33, Number 1, March 2007

Invited talk: Text Analysis and Machine Learning for Stylometrics and Stylogenetics
Walter Daelemans
Proceedings of the 16th Nordic Conference of Computational Linguistics (NODALIDA 2007)

2006

Constraint Satisfaction Inference: Non-probabilistic Global Inference for Sequence Labelling
Sander Canisius | Antal van den Bosch | Walter Daelemans
Proceedings of the Workshop on Learning Structured Information in Natural Language Applications

A Mission for Computational Natural Language Learning
Walter Daelemans
Proceedings of the Tenth Conference on Computational Natural Language Learning (CoNLL-X)

Investigating Lexical Substitution Scoring for Subtitle Generation
Oren Glickman | Ido Dagan | Walter Daelemans | Mikaela Keller | Samy Bengio
Proceedings of the Tenth Conference on Computational Natural Language Learning (CoNLL-X)

A mixed word / morphological approach for extending CELEX for high coverage on contemporary large corpora
Joris Vaneyghen | Guy De Pauw | Dirk Van Compernolle | Walter Daelemans
Proceedings of the Fifth International Conference on Language Resources and Evaluation (LREC’06)

2005

Improving Sequence Segmentation Learning by Predicting Trigrams
Antal van den Bosch | Walter Daelemans
Proceedings of the Ninth Conference on Computational Natural Language Learning (CoNLL-2005)

2004

Unsupervised Text Mining for Ontology Extraction: An Evaluation of Statistical Measures
Marie-Laure Reinberger | Walter Daelemans
Proceedings of the Fourth International Conference on Language Resources and Evaluation (LREC’04)

Evaluation and Adaptation of the Celex Dutch Morphological Database
Tom Laureys | Guy De Pauw | Hugo Van hamme | Walter Daelemans | Dirk Van Compernolle
Proceedings of the Fourth International Conference on Language Resources and Evaluation (LREC’04)

Multimodal, Multilingual Resources in the Subtitling Process
Stelios Piperidis | Iason Demiros | Prokopis Prokopidis | Peter Vanroose | Anja Hoethker | Walter Daelemans | Elsa Sklavounou | Manos Konstantinou | Yannis Karavidas
Proceedings of the Fourth International Conference on Language Resources and Evaluation (LREC’04)

Automatic Sentence Simplification for Subtitling in Dutch and English
Walter Daelemans | Anja Höthker | Erik Tjong Kim Sang
Proceedings of the Fourth International Conference on Language Resources and Evaluation (LREC’04)

A Comparison of Two Different Approaches to Morphological Analysis of Dutch
Guy De Pauw | Tom Laureys | Walter Daelemans | Hugo Van hamme
Proceedings of the 7th Meeting of the ACL Special Interest Group in Computational Phonology: Current Themes in Computational Phonology and Morphology

GAMBL, genetic algorithm optimization of memory-based WSD
Bart Decadt | Véronique Hoste | Walter Daelemans | Antal van den Bosch
Proceedings of SENSEVAL-3, the Third International Workshop on the Evaluation of Systems for the Semantic Analysis of Text

Memory-based semantic role labeling: Optimizing features, algorithm, and output
Antal van den Bosch | Sander Canisius | Walter Daelemans | Iris Hendrickx | Erik Tjong Kim Sang
Proceedings of the Eighth Conference on Computational Natural Language Learning (CoNLL-2004) at HLT-NAACL 2004

2003

Memory-Based Named Entity Recognition using Unannotated Data
Fien De Meulder | Walter Daelemans
Proceedings of the Seventh Conference on Natural Language Learning at HLT-NAACL 2003

Learning to Predict Pitch Accents and Prosodic Boundaries in Dutch
Erwin Marsi | Martin Reynaert | Antal van den Bosch | Walter Daelemans | Véronique Hoste
Proceedings of the 41st Annual Meeting of the Association for Computational Linguistics

2002

Dutch Word Sense Disambiguation: Optimizing the Localness of Context
Antal van den Bosch | Iris Hendrickx | Veronique Hoste | Walter Daelemans
Proceedings of the ACL-02 Workshop on Word Sense Disambiguation: Recent Successes and Future Directions

Evaluating the results of a memory-based word-expert approach to unrestricted word sense disambiguation
Veronique Hoste | Walter Daelemans | Iris Hendrickx | Antal van den Bosch
Proceedings of the ACL-02 Workshop on Word Sense Disambiguation: Recent Successes and Future Directions

Evaluation of Machine Learning Methods for Natural Language Processing Tasks
Walter Daelemans | Véronique Hoste
Proceedings of the Third International Conference on Language Resources and Evaluation (LREC’02)

A Field Survey for Establishing Priorities in the Development of HLT Resources for Dutch
D. Binnenpoorte | F. De Vriend | J. Sturm | W. Daelemans | H. Strik | C. Cucchiarini
Proceedings of the Third International Conference on Language Resources and Evaluation (LREC’02)

2001

Improving Accuracy in word class tagging through the Combination of Machine Learning Systems
Hans Van Halteren | Jakub Zavrel | Walter Daelemans
Computational Linguistics, Volume 27, Number 2, June 2001

Book Reviews: Learnability in Optimality Theory
Walter Daelemans
Computational Linguistics, Volume 27, Number 2, June 2001

Classifier Optimization and Combination in the English All Words Task
Véronique Hoste | Anne Kool | Walter Daelemans
Proceedings of SENSEVAL-2 Second International Workshop on Evaluating Word Sense Disambiguation Systems

2000

Bootstrapping a Tagged Corpus through Combination of Existing Heterogeneous Taggers
Jakub Zavrel | Walter Daelemans
Proceedings of the Second International Conference on Language Resources and Evaluation (LREC’00)

Part of Speech Tagging and Lemmatisation for the Spoken Dutch Corpus
Frank Van Eynde | Jakub Zavrel | Walter Daelemans
Proceedings of the Second International Conference on Language Resources and Evaluation (LREC’00)

A Rule Induction Approach to Modeling Regional Pronunciation Variation
Veronique Hoste | Steven Gillis | Walter Daelemans
COLING 2000 Volume 1: The 18th International Conference on Computational Linguistics

Applying System Combination to Base Noun Phrase Identification
Erik F. Tjong Kim Sang | Walter Daelemans | Herve Dejean | Rob Koeling | Yuval Krymolowski | Vasin Punyakanok | Dan Roth
COLING 2000 Volume 2: The 18th International Conference on Computational Linguistics

The Role of Algorithm Bias vs Information Source in Learning Algorithms for Morphosyntactic Disambiguation
Guy De Pauw | Walter Daelemans
Fourth Conference on Computational Natural Language Learning and the Second Learning Language in Logic Workshop

Genetic Algorithms for Feature Relevance Assignment in Memory-Based Language Processing
Anne Kool | Walter Daelemans | Jakub Zavrel
Fourth Conference on Computational Natural Language Learning and the Second Learning Language in Logic Workshop

1999

Memory-Based Morphological Analysis
Antal van den Bosch | Walter Daelemans
Proceedings of the 37th Annual Meeting of the Association for Computational Linguistics

Cascaded Grammatical Relation Assignment
Sabine Buchholz | Jorn Veenstra | Walter Daelemans
1999 Joint SIGDAT Conference on Empirical Methods in Natural Language Processing and Very Large Corpora

Memory-Based Shallow Parsing
Walter Daelemans | Sabine Buchholz | Jorn Veenstra
EACL 1999: CoNLL-99 Computational Natural Language Learning

1998

Abstraction Is Harmful in Language Learning
Walter Daelemans
New Methods in Language Processing and Computational Natural Language Learning

Modularity in Inductively-Learned Word Pronunciation Systems
Antal van den Bosch | Ton Weijters | Walter Daelemans
New Methods in Language Processing and Computational Natural Language Learning

Do Not Forget: Full Memory in Memory-Based Learning of Word Pronunciation
Antal van den Bosch | Walter Daelemans
New Methods in Language Processing and Computational Natural Language Learning

Improving Data Driven Wordclass Tagging by System Combination
Hans van Halteren | Jakub Zavrel | Walter Daelemans
36th Annual Meeting of the Association for Computational Linguistics and 17th International Conference on Computational Linguistics, Volume 1

Improving Data Driven Wordclass Tagging by System Combination
Hans van Halteren | Jakub Zavrel | Walter Daelemans
COLING 1998 Volume 1: The 17th International Conference on Computational Linguistics

1997

Resolving PP attachment Ambiguities with Memory-Based Learning
Jakub Zavrel | Walter Daelemans | Jorn Veenstra
CoNLL97: Computational Natural Language Learning

Memory-Based Learning: Using Similarity for Smoothing
Jakub Zavrel | Walter Daelemans
35th Annual Meeting of the Association for Computational Linguistics and 8th Conference of the European Chapter of the Association for Computational Linguistics

1996

Unsupervised Discovery of Phonological Categories through Supervised Learning of Morphological Rules
Walter Daelemans | Peter Berck | Steven Gillis
COLING 1996 Volume 1: The 16th International Conference on Computational Linguistics

MBT: A Memory-Based Part of Speech Tagger-Generator
Walter Daelemans | Jakub Zavrel | Peter Berck | Steven Gillis
Fourth Workshop on Very Large Corpora

1994

The Acquisition of Stress: A Data-Oriented Approach
Walter Daelemans | Steven Gillis | Gert Durieux
Computational Linguistics, Volume 20, Number 3, September 1994

Book Reviews: Inheritance, Defaults, and the Lexicon
Walter Daelemans
Computational Linguistics, Volume 20, Number 4, December 1994

1993

Data-Oriented Methods for Grapheme-to-Phoneme Conversion
Antal van den Bosch | Walter Daelemans
Sixth Conference of the European Chapter of the Association for Computational Linguistics

1992

Inheritance in Natural Language Processing
Walter Daelemans | Koenraad De Smedt | Gerald Gazdar
Computational Linguistics, Volume 18, Number 2, Special Issue on Inheritance: I

1988

GRAFON: A Grapheme-to-Phoneme Conversion System for Dutch
Walter Daelemans
Coling Budapest 1988 Volume 1: International Conference on Computational Linguistics
