Shared Task on NLP application to field linguistics - Field Matters

Event Notification Type: Call for Participation
Location: online
Submission Deadline: Wednesday, 10 August 2022

Field linguistics plays a crucial role in the development of linguistic theory and universal language modeling, as it is indisputably the only way to obtain structural data about the rapidly diminishing diversity of natural languages.

We offer two shared tasks on processing speech in field linguistic recordings. Linguistic data collection involves recording narratives, wordlists, and grammatical elicitation (enquêtes). Narratives are a priceless source of linguistic, anthropological and socio-cultural information. Wordlists are basic building blocks for everyone who studies or learns a language. Enquêtes provide a unique view of what is possible and what is forbidden in a language, giving researchers negative examples with which to adjust their theoretical models.

Automatic speech processing can reduce the time spent on processing language data. In these shared tasks, we seek ways to cut down on this monotonous routine. We propose two tasks targeting two stages of annotating linguistic recordings: diarization and transcription (ASR).

The evaluation period has already started and will run until August 10 (the deadline may be extended). The full schedule is below:

  • Test data release; Evaluation start: July 17
  • Evaluation end: August 10
  • System description paper deadline: August 17
  • Deadline for reviews of system description papers: August 26
  • Author notifications: August 28
  • Camera-ready description paper deadline: September 5

We have prepared a baseline solution for both tracks. The data, detailed task descriptions, and baselines are available on our website.