WMT 2021 shared task: "Triangular MT: Using English to improve Russian-to-Chinese machine translation"

Event Notification Type: 
Call for Participation
Abbreviated Title: 
Triangular MT WMT 2021 shared task
Location: 
Collocated with EMNLP 2021 / virtual event
Wednesday, 10 November 2021 to Thursday, 11 November 2021
State: 
Country: 
Dominican Republic
City: 
Punta Cana / virtual event
Contact: 
Ajay Nagesh
Submission Deadline: 
Monday, 19 July 2021

Greetings! This is a call for participation for the shared task: "Triangular MT: Using English to improve Russian-to-Chinese machine translation".

Given a low-resource language pair (X/Y), the bulk of previous MT work has pursued one of two strategies:

- Direct: Collect parallel X/Y data from the web, and train an X-to-Y translator, OR
- Pivot: Collect parallel X/English and Y/English data (often much larger than X/Y data), train two translators (X-to-English + English-to-Y), and pipeline them to form an X-to-Y translator
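
As a rough illustration of the pivot strategy, the pipelining step can be sketched as a composition of two translation functions. This is only a minimal sketch, not the task's baseline system: `make_pivot_translator` is a hypothetical helper, and the two toy dictionary-based translators merely stand in for trained X-to-English and English-to-Y models.

```python
from typing import Callable

# A translator is modelled here as any function from a source segment to a target segment.
Translator = Callable[[str], str]


def make_pivot_translator(src_to_en: Translator, en_to_tgt: Translator) -> Translator:
    """Pipeline two translators (X-to-English, English-to-Y) into one X-to-Y translator."""

    def translate(segment: str) -> str:
        english = src_to_en(segment)   # first hop: X -> English
        return en_to_tgt(english)      # second hop: English -> Y

    return translate


# Toy stand-ins (assumptions for illustration only): word-by-word dictionary lookups.
RU_EN = {"привет": "hello", "мир": "world"}
EN_ZH = {"hello": "你好", "world": "世界"}


def toy_ru_to_en(segment: str) -> str:
    # Unknown words are passed through unchanged.
    return " ".join(RU_EN.get(word, word) for word in segment.split())


def toy_en_to_zh(segment: str) -> str:
    # Chinese output is written without spaces between words.
    return "".join(EN_ZH.get(word, word) for word in segment.split())


ru_to_zh = make_pivot_translator(toy_ru_to_en, toy_en_to_zh)
print(ru_to_zh("привет мир"))
```

Because the second translator only ever sees the output of the first, any error made in the English hop is carried into the final translation, which is one motivation for exploring alternatives to plain pipelining.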

However, there are many other possible strategies for combining such resources. These may involve, for example, ensemble methods, multi-source training methods, multi-target training methods, or novel data augmentation methods.

The goals of this shared task are to promote:

- translation between non-English languages,
- optimally mixing direct and indirect parallel resources, and
- exploiting noisy, parallel web corpora

Task: Russian-to-Chinese machine translation

We provide three parallel corpora:

- Chinese/Russian: crawled from the web and aligned at the segment level, and combined with different public resources
- Chinese/English: combining several public resources
- Russian/English: combining several public resources

We evaluate system translations on a (secret) mixed-genre test set, drawn from the web and curated for high-quality segment pairs. After receiving the test data, participants have one week to submit translations. After all submissions are received, we will post a populated leaderboard that will continue to accept post-evaluation submissions.

To participate, please register for the shared task on Codalab.

More details are available at http://www.statmt.org/wmt21/triangular-mt-task.html

Important Dates

- Apr 5, 2021: Release of training and development resources
- Apr 5, 2021: Release of the baseline system
- Jul 12, 2021: Release of test data
- Jul 19, 2021: Official submissions due by web upload
- Jul 20, 2021: Release of the official results
- Aug 5, 2021: System description paper due
- Sep 5, 2021: Review feedback
- Sep 15, 2021: Camera-ready papers due
- Nov 10-11, 2021: Workshop

Ajay Nagesh
DiDi Labs, USA
Chair
Triangular MT shared task
WMT 2021