Shared Task on Detecting Entities in the Astrophysics Literature (DEAL) at WIESP in AACL-IJCNLP 2022

Event Notification Type: 
Call for Participation
Abbreviated Title: 
DEAL @ WIESP @ AACL-IJCNLP 2022
Location: 
Online (AACL-IJCNLP 2022)
Sunday, 20 November 2022
Contact Email: 
Contact: 
Tirthankar Ghosal
Submission Deadline: 
Wednesday, 10 August 2022

***Shared Task: Detecting Entities in the Astrophysics Literature (DEAL)***

***Website: https://ui.adsabs.harvard.edu/WIESP/2022/SharedTasks ***
***Twitter: https://twitter.com/wiesp_nlp ***

Much astrophysics research relies on data from missions and facilities such as ground observatories in remote locations or space telescopes, as well as digital archives that hold large amounts of observed and simulated data. These missions and facilities are frequently named after historical figures or given ingenious acronyms, which unfortunately makes them easy to confuse when searching the literature via simple string matching. For instance, Planck can refer to the person, the mission, the constant, or several institutions. Automatically recognizing entities such as missions or facilities would help tackle this word sense disambiguation problem.

The shared task consists of Named Entity Recognition (NER) on samples of text extracted from astrophysics publications. The labels were created by domain experts and designed to identify entities of interest to the astrophysics community. They range from simple to detect (e.g., URLs) to highly unstructured (e.g., Formula), and from useful to researchers (e.g., Telescope) to more useful to archivists and administrators (e.g., Grant). Overall, 31 different labels are included, and their distribution is highly unbalanced (e.g., ~100x more Citations than Proposals). Submissions will be scored using both the CoNLL-2000 shared task F1-score, as implemented in seqeval, at the entity level, and scikit-learn's Matthews correlation coefficient at the token level. We also encourage authors to propose their own evaluation metrics. A sample dataset and more instructions can be found at: https://ui.adsabs.harvard.edu/WIESP/2022/SharedTasks
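
To make the difference between the two scoring levels concrete, here is a minimal plain-Python sketch of what they measure. It is not the official scoring script (that is distributed with the data). It assumes IOB2-style tags (`B-Label`, `I-Label`, `O`), a micro-averaged F1, and the multiclass form of the Matthews correlation coefficient. The function names `extract_entities`, `entity_f1`, and `token_mcc` are hypothetical and chosen only for illustration.

```python
from collections import Counter
from math import sqrt


def extract_entities(tags):
    """Return the set of (label, start, end) spans in one IOB2-tagged sentence."""
    spans, start, label = set(), None, None
    for i, tag in enumerate(list(tags) + ["O"]):  # sentinel closes a trailing span
        prefix, _, name = tag.partition("-")
        continues = prefix == "I" and name == label
        if label is not None and not continues:
            spans.add((label, start, i))
            start, label = None, None
        # Assumption: an I- tag with no open span starts a new span (lenient).
        if prefix == "B" or (prefix == "I" and label is None):
            start, label = i, name
    return spans


def entity_f1(gold_sentences, pred_sentences):
    """Micro-averaged F1: a predicted entity counts only on an exact span match."""
    gold, pred = set(), set()
    for n, (g, p) in enumerate(zip(gold_sentences, pred_sentences)):
        gold |= {(n, *span) for span in extract_entities(g)}
        pred |= {(n, *span) for span in extract_entities(p)}
    true_positives = len(gold & pred)
    if true_positives == 0:
        return 0.0
    precision = true_positives / len(pred)
    recall = true_positives / len(gold)
    return 2 * precision * recall / (precision + recall)


def token_mcc(gold_sentences, pred_sentences):
    """Multiclass Matthews correlation coefficient over all tokens."""
    gold = [t for sent in gold_sentences for t in sent]
    pred = [t for sent in pred_sentences for t in sent]
    total = len(gold)
    correct = sum(g == p for g, p in zip(gold, pred))
    true_counts, pred_counts = Counter(gold), Counter(pred)
    numerator = correct * total - sum(
        true_counts[k] * pred_counts[k] for k in true_counts
    )
    denominator = sqrt(
        (total**2 - sum(v * v for v in pred_counts.values()))
        * (total**2 - sum(v * v for v in true_counts.values()))
    )
    return numerator / denominator if denominator else 0.0


# Toy example: the system finds the Telescope entity but misses the Grant.
gold = [["B-Telescope", "I-Telescope", "O", "B-Grant"]]
pred = [["B-Telescope", "I-Telescope", "O", "O"]]
print(round(entity_f1(gold, pred), 4))  # 0.6667
print(round(token_mcc(gold, pred), 4))  # 0.7303
```

In this toy case three of four tokens are tagged correctly, yet only one of two entities is recovered, so the two scores differ. With a highly unbalanced label distribution, reporting both gives a fuller picture than either one alone.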

Participants (individuals or groups) will have the opportunity to present their findings during the workshop and write a short paper. Authors of the best-performing or most interesting approaches may be invited to collaborate further with the NASA Astrophysics Data System (https://ui.adsabs.harvard.edu/).

The DEAL shared task is a part of the 1st Workshop on Information Extraction from Scientific Publications (WIESP) at AACL-IJCNLP 2022:

https://ui.adsabs.harvard.edu/WIESP/2022/

***Please fill in this form to report your intention to participate in the shared task***

https://forms.office.com/r/KKpeKJBLy3

***Shared Task Submission***

Link to data and scoring scripts: https://huggingface.co/datasets/fgrezes/WIESP2022-NER
CodaLab link to the online competition: https://codalab.lisn.upsaclay.fr/competitions/5062

***Important Dates***

Training+Validation Data Release: June 1, 2022
Validation Phase: June 1 - July 31, 2022
Test Data Release: August 1, 2022
Final Scoring Period: August 1 - August 10, 2022
System Report Submission: August 25, 2022
Notification: September 25, 2022
Camera-ready Submission Deadline: October 10, 2022
Event Date: November 20, 2022 (online)

***All submission deadlines are 11:59 pm UTC-12 (“Anywhere on Earth”)***
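
For participants converting an "Anywhere on Earth" deadline to another time zone, a minimal sketch using only Python's standard library follows. It uses the end of the final scoring period (August 10, 2022) as the example; the variable names are illustrative only.

```python
from datetime import datetime, timedelta, timezone

# "Anywhere on Earth" (AoE) corresponds to a fixed offset of UTC-12.
AOE = timezone(timedelta(hours=-12), name="AoE")

# Example: end of the final scoring period, August 10, 2022, 11:59 pm AoE.
deadline_aoe = datetime(2022, 8, 10, 23, 59, tzinfo=AOE)
deadline_utc = deadline_aoe.astimezone(timezone.utc)

print(deadline_utc.isoformat())  # 2022-08-11T11:59:00+00:00
```

In other words, a deadline of 11:59 pm AoE falls at 11:59 am UTC on the following calendar day.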

WIESP @ AACL-IJCNLP 2022 --> https://ui.adsabs.harvard.edu/WIESP/2022/

***Organizers***
- Tirthankar Ghosal, Charles University, CZ
- Sergi Blanco-Cuaresma, Center for Astrophysics | Harvard & Smithsonian, USA
- Alberto Accomazzi, Center for Astrophysics | Harvard & Smithsonian, USA
- Robert M. Patton, Oak Ridge National Laboratory, USA
- Felix Grezes, Center for Astrophysics | Harvard & Smithsonian, USA
- Thomas Allen, Center for Astrophysics | Harvard & Smithsonian, USA