First CfP: Aggregating and analysing crowdsourced annotations for NLP (AnnoNLP)

Event Notification Type: 
Call for Papers
Abbreviated Title: 
AnnoNLP
Location: 
EMNLP-IJCNLP
Sunday, 3 November 2019
City: 
Hong Kong
Contact: 
Silviu Paun
Dirk Hovy
Submission Deadline: 
Monday, 2 September 2019

Aggregating and analysing crowdsourced annotations for NLP (AnnoNLP)
The workshop is hosted by EMNLP-IJCNLP 2019, which will take place in Hong Kong. Full details are available here: http://dali.eecs.qmul.ac.uk/annonlp

Background
Crowdsourcing, whether through microwork platforms or through Games with a Purpose, is increasingly used as an alternative to traditional expert annotation, achieving comparable annotation quality at lower cost and offering greater scalability. The NLP community has enthusiastically adopted crowdsourcing to support work on tasks such as coreference resolution, sentiment analysis, textual entailment, named entity recognition, word similarity, word sense disambiguation, and many others. This interest has also led to a number of workshops at ACL and elsewhere, from as early as “The People’s Web meets NLP” in 2009. These days, general-purpose research on crowdsourcing can be presented at HCOMP or CrowdML, but the need for workshops focused specifically on the use of crowdsourcing in NLP remains.

In particular, NLP-specific methods are typically required for aggregating the interpretations provided by the annotators. Most existing work on aggregation methods is based on a common set of assumptions: 1) the true classes are independent of one another, 2) the set of classes the coders can choose from is fixed across the annotated items, and 3) there is one true class per item. For many NLP tasks, however, these assumptions are not entirely appropriate. For example, sequence labelling tasks (e.g., NER, tagging) have an implicit inter-label dependence. In other tasks, such as coreference, the labels the coders can choose from are not fixed but depend on the mentions in each document. Furthermore, in many NLP tasks the data items can have more than one interpretation. Such ambiguity also affects the reliability of existing gold standard datasets, which are often labelled with a single interpretation even though expert disagreement is a well-known issue. This latter point motivates research on alternative, complementary evaluation methods, as well as the development of multi-label datasets.

More broadly, this workshop aims to bring together researchers interested in methods for aggregating and analysing crowdsourced data for NLP-specific tasks that relax the assumptions above. We also invite work on ambiguous, subjective, or complex annotation tasks, which have received less attention in the literature.
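To make these assumptions concrete, here is a small illustrative sketch (in Python, with hypothetical item names and labels; it is not part of any workshop software). Majority voting, like most standard aggregation methods, commits to a single label per item from a fixed label set, which is exactly where ambiguous items cause trouble:

    from collections import Counter

    # Hypothetical crowd annotations: item id -> labels chosen by the coders.
    annotations = {
        "item-1": ["POS", "POS", "NEG"],           # clear majority
        "item-2": ["POS", "NEG", "POS", "NEG"],    # genuine disagreement
    }

    def majority_vote(labels):
        # Encodes the 'one true class per item' assumption:
        # ties and near-ties are collapsed to a single label.
        return Counter(labels).most_common(1)[0][0]

    for item, labels in annotations.items():
        counts = Counter(labels)
        print(item, "->", majority_vote(labels), dict(counts))

For item-2 the vote is split evenly; an aggregation method that allowed multiple interpretations per item would retain both labels (and their support) rather than discarding the disagreement.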

Objectives
Although there is a large body of work on analysing crowdsourced data, whether probabilistic (models of annotation) or traditional (majority-voting aggregation, agreement statistics), comparatively little of it has been devoted to NLP tasks. NLP data often violate the assumptions made by most existing models, opening the path to new research. The aim of this workshop is to bring together the community of researchers interested in this area.
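For reference, a classic example of a probabilistic model of annotation is that of Dawid and Skene (1979), which estimates per-annotator confusion matrices by EM. The sketch below (Python/NumPy, with an assumed input format of (item, annotator, class) triples) is a minimal illustration only, and it bakes in precisely the standard assumptions discussed above: a label set fixed across items, independent items, and a single true class per item.

    import numpy as np

    def dawid_skene(labels, n_classes, n_iter=50):
        # labels: integer array of (item, annotator, class) triples, 0-indexed.
        items, coders, cls = labels[:, 0], labels[:, 1], labels[:, 2]
        n_items, n_coders = items.max() + 1, coders.max() + 1

        # Initialise per-item class posteriors from raw vote proportions.
        T = np.zeros((n_items, n_classes))
        np.add.at(T, (items, cls), 1.0)
        T /= T.sum(axis=1, keepdims=True)

        for _ in range(n_iter):
            # M-step: class prevalence and per-annotator confusion matrices.
            pi = T.sum(axis=0) + 1e-6
            pi /= pi.sum()
            conf = np.full((n_coders, n_classes, n_classes), 1e-6)  # smoothed
            for i, j, c in labels:
                conf[j, :, c] += T[i]
            conf /= conf.sum(axis=2, keepdims=True)

            # E-step: posterior over the (single) true class of each item.
            logT = np.tile(np.log(pi), (n_items, 1))
            for i, j, c in labels:
                logT[i] += np.log(conf[j, :, c])
            T = np.exp(logT - logT.max(axis=1, keepdims=True))
            T /= T.sum(axis=1, keepdims=True)
        return T  # T[i, k]: estimated probability that item i has class k

Relaxing any of these assumptions, e.g., to allow sequential dependencies between labels, item-specific label sets, or multiple valid interpretations per item, requires changing the model itself, which is the kind of work the workshop hopes to attract.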

Topics
Topics of interest include but are not limited to the following:

  • Label aggregation methods for NLP tasks
    • Sequential labelling tasks (e.g., NER, chunking)
    • Tasks where the set of labels is not fixed across the data items (e.g., coreference)
    • Other cases of complex labels (e.g., syntactic annotation)
  • The effects of ambiguity
    • Allowing for multiple interpretations per data item
    • Assessing the reliability of existing gold standard datasets
      • New evaluation methodologies
      • New multi-label datasets
  • Subjective, complex tasks
    • Can the crowd successfully annotate such tasks? How should the task be designed to facilitate the annotation process?

Important Dates
May 13, 2019: First call for papers
June 14, 2019: Second call for papers
September 2, 2019: Submission deadline
September 16, 2019: Notification of acceptance
September 30, 2019: Camera-ready papers due
November 3/4, 2019: Workshop date

Submission Details
We welcome both long and short papers. Long papers may have up to 8 pages of content and short papers up to 4 pages of content; both may include an unlimited number of pages for references. Submissions should follow the EMNLP-IJCNLP 2019 guidelines.
Papers should be submitted online via START: https://www.softconf.com/emnlp2019/ws-AnnoNLP

Workshop Organizers
Silviu Paun, Queen Mary University of London
Dirk Hovy, Bocconi University

Invited Speakers
Jordan Boyd-Graber, University of Maryland
Edwin Simpson, Technische Universität Darmstadt

Programme Committee
Omar Alonso, Microsoft (TBC)
Beata Beigman Klebanov, Educational Testing Service (ETS), Princeton
Bob Carpenter, Columbia University
Jon Chamberlain, University of Essex
Anca Dumitrache, Vrije Universiteit Amsterdam
Paul Felt, IBM
Udo Kruschwitz, University of Essex
Matthew Lease, University of Texas at Austin
Massimo Poesio, Queen Mary University of London
Vikas C Raykar, IBM
Edwin Simpson, Technische Universität Darmstadt
Henning Wachsmuth, Universität Paderborn
Yudian Zheng, Twitter