Markus Dickinson


2018

Annotating picture description task responses for content analysis
Levi King | Markus Dickinson
Proceedings of the Thirteenth Workshop on Innovative Use of NLP for Building Educational Applications

Given that all users of a language can be creative in their language usage, the overarching goal of this work is to investigate issues of variability and acceptability in written text, for both non-native speakers (NNSs) and native speakers (NSs). We control for meaning by collecting a dataset of picture description task (PDT) responses from a number of NSs and NNSs, and we define and annotate a handful of features pertaining to form and meaning, to capture the multi-dimensional ways in which responses can vary and can be acceptable. By examining the decisions made in this corpus development, we highlight the questions facing anyone working with learner language properties like variability, acceptability and native-likeness. We find reliable inter-annotator agreement, though disagreements point to difficult areas for establishing a link between form and meaning.

2017

Gender Prediction for Chinese Social Media Data
Wen Li | Markus Dickinson
Proceedings of the International Conference Recent Advances in Natural Language Processing, RANLP 2017

Social media provides users a platform to publish messages and socialize with others, and microblogs have gained more users than ever in recent years. With such usage, user profiling is a popular task in computational linguistics and text mining. Different approaches have been used to predict users’ gender, age, and other information, but most of this work has been done on English and other Western languages. The goal of this project is to predict the gender of users based on their posts on Weibo, a Chinese micro-blogging platform. Given issues in Chinese word segmentation, we explore character and word n-grams as features for this task, as well as using character and word embeddings for classification. Given how the data is extracted, we approach the task on a per-post basis, and we show the difficulties of the task for both humans and computers. Nonetheless, we present encouraging results and point to future improvements.

2016

Shallow Semantic Reasoning from an Incomplete Gold Standard for Learner Language
Levi King | Markus Dickinson
Proceedings of the 11th Workshop on Innovative Use of NLP for Building Educational Applications

Cost-Effectiveness in Building a Low-Resource Morphological Analyzer for Learner Language
Scott Ledbetter | Markus Dickinson
Proceedings of the 11th Workshop on Innovative Use of NLP for Building Educational Applications

A Multilinear Approach to the Unsupervised Learning of Morphology
Anthony Meyer | Markus Dickinson
Proceedings of the 14th SIGMORPHON Workshop on Computational Research in Phonetics, Phonology, and Morphology

2015

Automatic morphological analysis of learner Hungarian
Scott Ledbetter | Markus Dickinson
Proceedings of the Tenth Workshop on Innovative Use of NLP for Building Educational Applications

On Grammaticality in the Syntactic Annotation of Learner Language
Markus Dickinson | Marwa Ragheb
Proceedings of the 9th Linguistic Annotation Workshop

2014

IUCL: Combining Information Sources for SemEval Task 5
Alex Rudnick | Levi King | Can Liu | Markus Dickinson | Sandra Kübler
Proceedings of the 8th International Workshop on Semantic Evaluation (SemEval 2014)

Leveraging known Semantics for Spelling Correction
Levi King | Markus Dickinson
Proceedings of the third workshop on NLP for computer-assisted language learning

2013

Towards Domain Adaptation for Parsing Web Data
Mohammad Khan | Markus Dickinson | Sandra Kübler
Proceedings of the International Conference Recent Advances in Natural Language Processing RANLP 2013

Detecting and Correcting Learner Korean Particle Omission Errors
Ross Israel | Markus Dickinson | Sun-Hee Lee
Proceedings of the Sixth International Joint Conference on Natural Language Processing

Does Size Matter? Text and Grammar Revision for Parsing Social Media Data
Mohammad Khan | Markus Dickinson | Sandra Kübler
Proceedings of the Workshop on Language Analysis in Social Media

Shallow Semantic Analysis of Interactive Learner Sentences
Levi King | Markus Dickinson
Proceedings of the Eighth Workshop on Innovative Use of NLP for Building Educational Applications

Inter-annotator Agreement for Dependency Annotation of Learner Language
Marwa Ragheb | Markus Dickinson
Proceedings of the Eighth Workshop on Innovative Use of NLP for Building Educational Applications

2012

Problems in Evaluating Grammatical Error Detection Systems
Martin Chodorow | Markus Dickinson | Ross Israel | Joel Tetreault
Proceedings of COLING 2012

Defining Syntax for Learner Language Annotation
Marwa Ragheb | Markus Dickinson
Proceedings of COLING 2012: Posters

Predicting Learner Levels for Online Exercises of Hebrew
Markus Dickinson | Sandra Kübler | Anthony Meyer
Proceedings of the Seventh Workshop on Building Educational Applications Using NLP

Sense-Specific Lexical Information for Reading Assistance
Soojeong Eom | Markus Dickinson | Rebecca Sachs
Proceedings of the Seventh Workshop on Building Educational Applications Using NLP

Developing Learner Corpus Annotation for Korean Particle Errors
Sun-Hee Lee | Markus Dickinson | Ross Israel
Proceedings of the Sixth Linguistic Annotation Workshop

Using semi-experts to derive judgments on word sense alignment: a pilot study
Soojeong Eom | Markus Dickinson | Graham Katz
Proceedings of the Eighth International Conference on Language Resources and Evaluation (LREC'12)

The overall goal of this project is to evaluate the performance of word sense alignment (WSA) systems, focusing on obtaining examples appropriate to language learners. Building a gold standard dataset based on human expert judgments is costly in time and labor, and thus we gauge the utility of using semi-experts to perform the annotation. In an online survey, we present a sense of a target word from one dictionary alongside senses from a second dictionary and ask for judgments of relatedness. We note the difficulty of reaching agreement, but also the utility of using such results to evaluate WSA work. We find that one's treatment of related senses heavily impacts the results for WSA.

Annotating Errors in a Hungarian Learner Corpus
Markus Dickinson | Scott Ledbetter
Proceedings of the Eighth International Conference on Language Resources and Evaluation (LREC'12)

We are developing and annotating a learner corpus of Hungarian, composed of student journals from three different proficiency levels written at Indiana University. Our annotation marks learner errors that are of different linguistic categories, including phonology, morphology, and syntax, but defining the annotation for an agglutinative language presents several issues. First, we must adapt an analysis that is centered on the morpheme rather than the word. Second, and more importantly, we see a need to distinguish errors from secondary corrections. We argue that although certain learner errors require a series of corrections to reach a target form, these secondary corrections, conditioned on those that come before, are our own adjustments that link the learner's productions to the target form and are not representative of the learner's internal grammar. In this paper, we report the annotation scheme and the principles that guide it, as well as examples illustrating its functionality and directions for expansion.

2011

Developing Methodology for Korean Particle Error Detection
Markus Dickinson | Ross Israel | Sun-Hee Lee
Proceedings of the Sixth Workshop on Innovative Use of NLP for Building Educational Applications

Detecting Dependency Parse Errors with Minimal Resources
Markus Dickinson | Amber Smith
Proceedings of the 12th International Conference on Parsing Technologies

2010

Evaluating Distributional Properties of Tagsets
Markus Dickinson | Charles Jochim
Proceedings of the Seventh International Conference on Language Resources and Evaluation (LREC'10)

We investigate which distributional properties should be present in a tagset by examining different mappings of various current part-of-speech tagsets, looking at English, German, and Italian corpora. Given the importance of distributional information, we present a simple model for evaluating how a tagset mapping captures distribution, specifically by utilizing a notion of frames to capture the local context. In addition to an accuracy metric capturing the internal quality of a tagset, we introduce a way to evaluate the external quality of tagset mappings so that we can ensure that the mapping retains linguistically important information from the original tagset. Although most of the mappings we evaluate are motivated by linguistic concerns, we also explore an automatic, bottom-up way to define mappings, to illustrate that better distributional mappings are possible. Comparing our initial evaluations to POS tagging results, we find that more distributional tagsets can sometimes result in worse accuracy, underscoring the need to carefully define the properties of a tagset.

Building a Korean Web Corpus for Analyzing Learner Language
Markus Dickinson | Ross Israel | Sun-Hee Lee
Proceedings of the NAACL HLT 2010 Sixth Web as Corpus Workshop

Consistency Checking for Treebank Alignment
Markus Dickinson | Yvonne Samuelsson
Proceedings of the Fourth Linguistic Annotation Workshop

Detecting Errors in Automatically-Parsed Dependency Relations
Markus Dickinson
Proceedings of the 48th Annual Meeting of the Association for Computational Linguistics

Generating Learner-Like Morphological Errors in Russian
Markus Dickinson
Proceedings of the 23rd International Conference on Computational Linguistics (Coling 2010)

2009

Categorizing Local Contexts as a Step in Grammatical Category Induction
Markus Dickinson | Charles Jochim
Proceedings of the EACL 2009 Workshop on Cognitive Aspects of Computational Language Acquisition

Correcting Dependency Annotation Errors
Markus Dickinson
Proceedings of the 12th Conference of the European Chapter of the ACL (EACL 2009)

2008

Ad Hoc Treebank Structures
Markus Dickinson
Proceedings of ACL-08: HLT

Detecting Errors in Semantic Annotation
Markus Dickinson | Chong Min Lee
Proceedings of the Sixth International Conference on Language Resources and Evaluation (LREC'08)

We develop a method for detecting errors in semantic predicate-argument annotation, based on the variation n-gram error detection method. After establishing an appropriate data representation, we detect inconsistencies by searching for identical text with varying annotation. By remaining data-driven, we are able to detect inconsistencies arising from errors at lower layers of annotation.

A Simple Method for Tagset Comparison
Markus Dickinson | Charles Jochim
Proceedings of the Sixth International Conference on Language Resources and Evaluation (LREC'08)

Based on the idea that local contexts predict the same basic category across a language, we develop a simple method for comparing tagsets across corpora. The principal differences between tagsets are evidenced by variation in categories in one corpus in the same contexts where another corpus exhibits only a single tag. Such mismatches highlight differences in the definitions of tags, which are crucial when porting technology from one annotation scheme to another.

Developing Online ICALL Resources for Russian
Markus Dickinson | Joshua Herring
Proceedings of the Third Workshop on Innovative Use of NLP for Building Educational Applications

Representations for category disambiguation
Markus Dickinson
Proceedings of the 22nd International Conference on Computational Linguistics (Coling 2008)

2006

From Detecting Errors to Automatically Correcting Them
Markus Dickinson
11th Conference of the European Chapter of the Association for Computational Linguistics

2005

“Language and Computers”: Creating an Introduction for a General Undergraduate Audience
Chris Brew | Markus Dickinson | W. Detmar Meurers
Proceedings of the Second ACL Workshop on Effective Tools and Methodologies for Teaching NLP and CL

Detecting Errors in Discontinuous Structural Annotation
Markus Dickinson | W. Detmar Meurers
Proceedings of the 43rd Annual Meeting of the Association for Computational Linguistics (ACL’05)

2003

Detecting Errors in Part-of-Speech Annotation
Markus Dickinson | W. Detmar Meurers
10th Conference of the European Chapter of the Association for Computational Linguistics