Final Call for Papers:
1st Workshop on Evaluation and Comparison for NLP systems (Eval4NLP)
Website: https://nlpevaluation2020.github.io/
Contact: evaluation.nlp.workshop2020@gmail.com
=======================================
The 1st Workshop on Evaluation and Comparison for NLP systems (Eval4NLP), co-located with the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP 2020), invites papers of a theoretical or experimental nature describing recent advances in system evaluation and comparison in NLP, particularly for text generation.
=======================================
Important Dates
2020-08-23: Submission deadline **(extended)**
2020-09-15: Deadline for withdrawing papers accepted at the EMNLP 2020 main conference
2020-09-29: Notification of acceptance
2020-10-10: Deadline for camera-ready version
2020-11-20: Workshop Date
=======================================
Overview
Fair evaluations and comparisons are of fundamental importance to the NLP community to properly track progress, especially within the current deep learning revolution, where new state-of-the-art results are reported at ever shorter intervals. This concerns designing adequate metrics for evaluating performance on high-level text generation tasks such as question answering, dialogue, summarization, machine translation, image captioning, and poetry generation; properly evaluating word and sentence embeddings; and rigorously determining whether, and under which conditions, one system is better than another.
=======================================
Topics
- Novel individual evaluation metrics
  - with desirable properties, e.g., high correlations with human judgments;
  - reference-free evaluation metrics, defined in terms of the source text(s) and system predictions only;
  - cross-domain metrics that can reliably and robustly measure the quality of system outputs from heterogeneous modalities;
  - supervised, unsupervised, and semi-supervised metrics.
- Designing adequate evaluation methodology
  - statistics for the trustworthiness of results, via appropriate significance tests;
  - reproducibility;
  - comprehensive and fair comparisons;
  - methodologies for human evaluation;
  - validation of metrics against human evaluations.
- Creating adequate and correct evaluation data
  - coverage of phenomena, representativeness/balance/distribution with respect to the task, etc.;
  - size of corpora, variability among data sources, eras, genres, etc.;
  - system evaluation using appropriate annotations;
  - cost-effective manual evaluations with good inter-annotator agreement;
  - introspection and elimination of biases in the annotated data, e.g., via probing and adversarial attacks.
=======================================
Submission Guidelines:
Authors are invited to submit either a long paper of up to 8 pages or a short paper of up to 4 pages, in each case with up to 2 additional pages for references, following the EMNLP 2020 formatting requirements. The reported research should be substantially original. Reviewing will be double-blind; no author information should be included in the papers, and self-references that identify the authors should be avoided or anonymized. Accepted papers will be published in the workshop proceedings and included in the ACL Anthology.
The submission site is https://www.softconf.com/emnlp2020/nlpevaluation2020/
**Dual submission with the EMNLP 2020 main conference is permitted. If the paper is accepted at the main conference, please withdraw the workshop submission by September 15, 2020.**
=======================================
Steering Committee:
Ido Dagan, Bar-Ilan University
Ani Nenkova, University of Pennsylvania
Robert West, École polytechnique fédérale de Lausanne (EPFL)
Mohit Bansal, University of North Carolina (UNC) at Chapel Hill
=======================================
Organizing Committee
Eduard Hovy, Carnegie Mellon University
Steffen Eger, Technische Universität Darmstadt
Yang Gao, Royal Holloway, University of London
Maxime Peyrard, École polytechnique fédérale de Lausanne (EPFL)
Wei Zhao, Technische Universität Darmstadt