2nd IWPT Shared Task on Enhanced Universal Dependencies Parsing

Event Notification Type: 
Call for Participation
Abbreviated Title: 
IWPT ST EUD Parsing
Sunday, 1 August 2021
Country: 
Thailand
Contact Email: 
City: 
Bangkok
Contact: 
Djamé Seddah
Daniel Zeman
Gosse Bouma
Submission Deadline: 
Saturday, 22 May 2021

============== IWPT 2021 EUD SHARED TASK ==============
The IWPT 2021 conference (https://iwpt21.sigparse.org),
co-located with ACL 2021, hosts the 2nd Shared Task on
Enhanced Universal Dependencies Parsing.

** updates:
- schedule corrected (now synced with website)
- a new data set is included (English GUM treebank)
- registration via the shared-task mailing list

web: https://universaldependencies.org/iwpt21/

===== Summary =====
Following the success of the 1st IWPT Shared Task on Enhanced Universal
Dependency Parsing in 2020, a second edition of the IWPT Parsing Shared
Task is launched with an emphasis on the parsing of Enhanced
Universal Dependencies (which often reflect deeper syntactic structures,
represented as more complex graphs, than regular surface dependencies).

https://iwpt21.sigparse.org/sharedtask
Webpage: https://universaldependencies.org/iwpt21/

Release of training and dev. data: April 6
Test data release: May 1
Test submission deadline: May 22
System paper deadline: June 6
Camera Ready Deadline: June 30
Timezone: Anywhere on Earth (UTC-12)

Interested parties are encouraged to subscribe to the shared task mailing
list at http://sympa.inria.fr/sympa/info/iwptsharedtask.

===== Introduction =====

The IWPT 2021 Shared Task will be on Multilingual Parsing into
Enhanced Universal Dependencies (EUD). In recent years, Universal
Dependencies (UD), the de facto standard target representation in
surface-syntactic dependency parsing, have grown a second layer of
structure, called enhanced dependencies, where grammatical relations
that cannot be adequately represented in pure rooted trees are
encoded, for example control relations and argument sharing in
relative clauses, shared dependencies involving coordinate structures,
and dependencies involving ellipsis. Enhanced dependencies call for
non-tree graphs with reentrancies, cycles, and empty nodes.

Data for the shared task consists of at least the treebanks in UD
release 2.5 that contain enhanced annotation, and potentially one
or more additional languages/treebanks. The task will be parsing
from raw strings into EUD according to the guidelines at
https://universaldependencies.org/u/overview/enhanced-syntax.html. In
addition to a classic F-measure metric, evaluation will measure performance
per phenomenon and will take into account the fact that not all
treebanks cover all of the phenomena listed in the EUD guidelines.

===== Task Description and Evaluation =====

We invite participants to develop a system for parsing raw text
into enhanced universal dependencies for all of the languages
included in the training data. The task is similar to that of the
CoNLL 2017 and 2018 shared tasks on parsing into UD, except that
the primary evaluation metric now targets the enhanced dependency annotation.
Participants are encouraged to consider all enhancements listed in
the UD guidelines, even if some of these enhancements might be
absent in some of the treebanks included in the training data.
Evaluation will take into account the fact that some treebanks are
incomplete in this respect. Participants are also encouraged to
predict all lower levels of annotation (lemma, tag, morphological
features, basic dependency tree). These annotations will be evaluated
as secondary metrics. Also, it is possible to train a pre-existing
parser (such as UDPipe), use it to predict the lower levels of
annotation and then develop one’s own system that focuses on the
transition from the basic UD tree to the enhanced UD graph.
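
As an illustration of the last option, the sketch below shows one way a
post-processing step could turn a basic tree into an enhanced graph. It
is a deliberately simplified heuristic and not a method prescribed by the
shared task: it only copies the basic edges and lets each conjunct
inherit the incoming relation of the first conjunct. The function name
enhance_conjuncts, the dictionary layout and the toy sentence are
assumptions made for this example.

```python
from typing import Dict, Set, Tuple

# Basic tree: token id -> (head id, relation). Head 0 is the artificial root.
BasicTree = Dict[int, Tuple[int, str]]
# Enhanced graph: token id -> set of (head id, relation) pairs.
EnhancedGraph = Dict[int, Set[Tuple[int, str]]]


def enhance_conjuncts(tree: BasicTree) -> EnhancedGraph:
    """Toy enhancement: keep all basic edges and propagate the incoming
    relation of the first conjunct to every token attached to it as conj.

    Simplifications (assumptions of this sketch): the root relation is
    not propagated, nested coordination is not handled, and labels are
    not augmented (e.g. conj stays conj rather than conj:and).
    """
    graph: EnhancedGraph = {tok: {(head, rel)} for tok, (head, rel) in tree.items()}
    for tok, (head, rel) in tree.items():
        if rel != "conj" or head not in tree:
            continue
        parent_head, parent_rel = tree[head]
        if parent_head != 0 and parent_rel != "conj":
            graph[tok].add((parent_head, parent_rel))
    return graph


# Hand-written toy sentence: "She bought apples and pears"
toy_tree: BasicTree = {
    1: (2, "nsubj"),  # She    <- bought
    2: (0, "root"),   # bought <- ROOT
    3: (2, "obj"),    # apples <- bought
    4: (5, "cc"),     # and    <- pears
    5: (3, "conj"),   # pears  <- apples
}
toy_graph = enhance_conjuncts(toy_tree)
```

In toy_graph, "pears" keeps its conj edge to "apples" and gains an obj
edge to "bought", so it has two incoming edges and the structure is no
longer a tree.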

== Training Data ==

The evaluation will be done on 17 languages from 4 language families:
Arabic, Bulgarian, Czech, Dutch, English, Estonian, Finnish, French,
Italian, Latvian, Lithuanian, Polish, Russian, Slovak, Swedish,
Tamil, Ukrainian. The language selection is driven simply by the
fact that at least partial enhanced representation is available for
the given language. Training and development data in the CoNLL-U
format are available on the shared task website. These datasets are
based on the UD release 2.7 but the annotation is often not identical
to the corresponding treebank in UD 2.7. Nevertheless, the participants
are also allowed to use the training and development data from the
official UD 2.7 release package on Lindat, even in languages that
are not part of the shared task evaluation. No other version of
UD (either previous releases or GitHub repositories or other copies
and clones online) can be used in the shared task; this is to avoid
the danger of incompatible training-test splits.
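
For readers new to the format, here is a minimal sketch of how the
enhanced graph can be read from a CoNLL-U file. In CoNLL-U, each token
line has ten tab-separated columns and the ninth (DEPS) holds the
enhanced dependencies as head:label pairs separated by "|". The sample
sentence is hand-written for illustration and not taken from the shared
task data, and read_enhanced_edges is a hypothetical helper name.

```python
from typing import List, Tuple

# (dependent id, head id, label); ids stay strings because empty nodes
# use decimal ids such as "8.1".
Edge = Tuple[str, str, str]

# Hand-written toy sentence: "She" is the subject of "wants" and also of
# the controlled verb "leave", so it has two incoming edges.
SAMPLE = "\n".join([
    "# text = She wants to leave",
    "1\tShe\tshe\tPRON\t_\t_\t2\tnsubj\t2:nsubj|4:nsubj:xsubj\t_",
    "2\twants\twant\tVERB\t_\t_\t0\troot\t0:root\t_",
    "3\tto\tto\tPART\t_\t_\t4\tmark\t4:mark\t_",
    "4\tleave\tleave\tVERB\t_\t_\t2\txcomp\t2:xcomp\t_",
])


def read_enhanced_edges(sentence: str) -> List[Edge]:
    """Collect the enhanced dependency edges of one CoNLL-U sentence."""
    edges: List[Edge] = []
    for line in sentence.splitlines():
        if not line or line.startswith("#"):
            continue  # blank line or sentence-level comment
        cols = line.split("\t")
        if len(cols) != 10:
            raise ValueError(f"expected 10 columns, got {len(cols)}")
        tok_id, deps = cols[0], cols[8]
        if "-" in tok_id or deps == "_":
            continue  # multiword token range, or no enhanced annotation
        for item in deps.split("|"):
            # Split on the first colon only: labels such as nsubj:xsubj
            # contain colons themselves.
            head, label = item.split(":", 1)
            edges.append((tok_id, head, label))
    return edges


sample_edges = read_enhanced_edges(SAMPLE)
```

Here sample_edges contains five edges for four tokens, because token 1
has two heads; a basic tree would have exactly one edge per token.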

== Evaluation Metric ==

The primary evaluation metric is LAS on enhanced dependencies (DEPS),
where LAS is defined as the F1 score over the set of enhanced dependencies
in the system output and the gold standard. Complete edge labels
are taken into account, i.e. conj:and differs from conj. While
some effort has gone into ensuring that the data in the various
treebanks is annotated consistently w.r.t. the level of enhancements
and the format of enhanced labels, not all treebanks include all
of the enhancements listed in the UD guidelines. For those treebanks,
an additional evaluation will be carried out, where dependencies
that are the consequence of including enhancement E, where E is not
included in the training data of that treebank, are ignored during
evaluation.
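
The definition above can be sketched as follows. This is not the
official evaluation script: it assumes that system and gold outputs
share the same tokenization, so that edges can be compared directly as
(dependent, head, label) triples, whereas a system parsing raw text would
first need its tokens aligned to the gold tokens. The function name
edge_f1 and the toy edge sets are assumptions made for this example.

```python
from typing import Iterable, Tuple

Edge = Tuple[str, str, str]  # (dependent id, head id, full label)


def edge_f1(gold: Iterable[Edge], system: Iterable[Edge]) -> Tuple[float, float, float]:
    """Return (precision, recall, F1) over two sets of labeled edges."""
    gold_set, sys_set = set(gold), set(system)
    correct = len(gold_set & sys_set)
    precision = correct / len(sys_set) if sys_set else 0.0
    recall = correct / len(gold_set) if gold_set else 0.0
    if precision + recall == 0.0:
        return precision, recall, 0.0
    return precision, recall, 2 * precision * recall / (precision + recall)


# Toy gold and system edges (hand-written for illustration).
gold_edges = {
    ("1", "2", "nsubj"),
    ("1", "4", "nsubj:xsubj"),
    ("2", "0", "root"),
    ("4", "2", "xcomp"),
    ("5", "4", "conj:and"),
}
system_edges = {
    ("1", "2", "nsubj"),
    ("2", "0", "root"),
    ("4", "2", "xcomp"),
    ("5", "4", "conj"),  # wrong: the full label conj:and is required
}
precision, recall, f1 = edge_f1(gold_edges, system_edges)
```

With these toy sets, three edges match: the system misses the second
subject edge and its conj edge does not count because the complete label
differs, giving precision 3/4, recall 3/5 and F1 2/3.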

===== Shared Task Schedule =====

Release of training and dev. data: April 6
Test data release: May 1
Test submission deadline: May 22
System paper deadline: June 6
Camera Ready Deadline: June 30
Timezone: Anywhere on Earth (UTC-12)

Shared Task papers will be published in the "Working notes of the
IWPT 2021 Enhanced Dependency Parsing Shared Task".

===== Shared Task Organizers =====

Gosse Bouma (University of Groningen, Netherlands)
Djamé Seddah (Inria Paris, France)
Daniel Zeman (Charles University, Czechia)

===== IWPT 2021 Organizers =====

Yuji Matsumoto
Stephan Oepen
Kenji Sagae
Weiwei Sun
Anders Søgaard
Reut Tsarfaty

===== Contact details =====
- Mail: iwptsharedtask@gmail.com
- Webpage: https://universaldependencies.org/iwpt21/
- Mailing list: https://sympa.inria.fr/sympa/info/iwptsharedtask
- IWPT 2021 website: https://iwpt21.sigparse.org