SemEval 2021 Task 10: Source-Free Domain Adaptation for Semantic Processing

Event Notification Type: Call for Participation
Location: SemEval 2021
Contact: Egoitz Laparra, Steve Bethard, Tim Miller, Özlem Uzuner
Submission Deadline: Sunday, 31 January 2021

We are pleased to announce a new shared task "SemEval 2021 Task 10: Source-Free Domain Adaptation for Semantic Processing", where participants perform unsupervised domain adaptation given only unlabeled target-domain data and a model pre-trained on source-domain data (i.e., no annotated source-domain data is available).

If you are interested in participating, please join our CodaLab competition and Google group.

Overview
Data sharing restrictions are common in NLP datasets. For example, Twitter policies do not allow sharing of tweet text, though tweet IDs may be shared. Such restrictions are even more common in clinical NLP, where patient health information must be protected, and annotations over health text, when released at all, often require the signing of complex data use agreements. The SemEval-2021 Task 10 framework asks participants to develop semantic annotation systems in the face of data sharing constraints. A participant's goal is to develop an accurate system for a target domain when annotations exist for a related source domain but cannot be distributed. Instead of annotated training data, participants are given a model trained on those annotations. Then, given unlabeled target-domain data, they are asked to make predictions.

Tracks
We invite participation in two different semantic tasks: negation detection and time expression recognition.

  • Negation detection asks participants to classify clinical event mentions (e.g., diseases, symptoms, procedures) according to whether they are negated by their context. We expect most participants will treat this as a "span-in-context" classification problem, where the model jointly considers both the event to be classified and its surrounding context.
  • Time expression recognition asks participants to find time expressions in text. We expect most participants will treat this as a sequence tagging problem, as in other named-entity tagging tasks.
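
To make the two problem framings above concrete, here is a minimal Python sketch. It is an illustration only, not the task's official data format or the organizers' implementation: the marker tokens `<e>` and `</e>`, the plain `B`/`I`/`O` tag set, and the helper names `mark_event` and `bio_to_spans` are all assumptions introduced for this example.

```python
from typing import List, Tuple


def mark_event(text: str, start: int, end: int,
               open_tag: str = "<e>", close_tag: str = "</e>") -> str:
    """Wrap the event span text[start:end] in marker tokens.

    The marked string is one (assumed) way to present a "span-in-context"
    input: the classifier sees the event and its surrounding context together.
    """
    if not (0 <= start < end <= len(text)):
        raise ValueError("event span must lie inside the text")
    return f"{text[:start]}{open_tag} {text[start:end]} {close_tag}{text[end:]}"


def bio_to_spans(tags: List[str]) -> List[Tuple[int, int]]:
    """Convert per-token B/I/O tags into (start, end) token spans, end exclusive.

    An "I" tag with no open span is treated leniently as starting a new span.
    """
    spans: List[Tuple[int, int]] = []
    start = None  # index of the first token of the currently open span
    for i, tag in enumerate(tags):
        if tag == "B" or (tag == "I" and start is None):
            if start is not None:
                spans.append((start, i))  # close the previous span
            start = i
        elif tag == "O" and start is not None:
            spans.append((start, i))
            start = None
    if start is not None:
        spans.append((start, len(tags)))  # span runs to the end of the sequence
    return spans


if __name__ == "__main__":
    # Negation detection: mark the event mention inside its sentence.
    sentence = "The patient denies chest pain."
    begin = sentence.index("chest pain")
    print(mark_event(sentence, begin, begin + len("chest pain")))

    # Time expression recognition: recover spans from token-level tags.
    tokens = ["See", "you", "next", "Friday", "at", "noon"]
    tags = ["O", "O", "B", "I", "O", "B"]
    for s, e in bio_to_spans(tags):
        print(" ".join(tokens[s:e]))
```

In this sketch the negation classifier would receive the marked sentence and output a single label for the event, while the time expression tagger would output one tag per token that is then decoded into spans.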

Important Dates

  • 20 Aug 2020 Pre-trained models release
  • 3 Dec 2020 Test data release
  • 10 Jan 2021 Evaluation start
  • 31 Jan 2021 Evaluation end

Organizers

  • Egoitz Laparra, Yiyun Zhao, Steven Bethard (University of Arizona)
  • Tim Miller (Boston Children's Hospital and Harvard Medical School)
  • Özlem Uzuner (George Mason University)

Contact
e-mail: source-free-domain-adaptation-participants [at] googlegroups.com