We are pleased to announce the second round of the Model Compression Shared Task at WMT 2026.
This shared task evaluates how well model compression techniques can reduce the size of general-purpose large language models (LLMs) while preserving high translation quality in specific machine translation (MT) scenarios, balancing practical deployability against output quality. Its broader objectives are to foster research into the efficient, accessible, and sustainable deployment of LLMs for MT; to establish a common evaluation framework for monitoring progress in model compression across a wide range of languages; and to enable meaningful comparisons with state-of-the-art MT systems through standardized evaluation protocols that assess not only translation quality but also computational efficiency.
Although the focus is on model compression, the task is closely aligned with the General MT shared task, sharing test data from a subset of its language directions, as well as protocols for automatic MT quality evaluation. Additionally, the task follows the same timeline as the flagship WMT task.
We warmly invite participation from academic and industry teams interested in applying existing compression methods to MT or in exploring innovative new approaches.
THE TASK IN A NUTSHELL
Goal: Reduce the size of a general-purpose LLM while maintaining a balance between model compactness and MT performance.
Languages: The second round of the task will focus on a subset of the languages covered by the General MT task, namely: Czech to German, English to Chinese (Simplified), and English to Arabic (Egyptian).
Conditions:
Constrained: Participants will compress a specific model, using a predefined pool of data for calibration and fine-tuning (if needed) to ensure directly comparable results.
Unconstrained: Participants are free to compress any model whose original size is below 20B parameters and to use any additional data for calibration and fine-tuning.
Participation format: Participants will share their compressed models to be run on a standardized hardware environment provided by the organizers.
Evaluation Criteria:
Translation quality: Automatically assessed using multiple metrics, e.g., COMET, MetricX, and an LLM-as-a-judge framework.
Model size: Measured by the memory footprint of the submitted model.
Inference speed: Measured by total processing time over the test set.
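To make the size and speed criteria concrete, the sketch below shows one plausible way such measurements could be taken. It is only an illustration under stated assumptions, not the organizers' actual evaluation harness: `dummy_translate` is a hypothetical stand-in for a compressed model's translation call, and the memory estimate simply multiplies parameter count by bits per weight.

```python
import time

def footprint_gb(n_params: float, bits_per_param: int) -> float:
    """Rough memory footprint of model weights in gigabytes
    (parameter count x bits per parameter, converted to GB)."""
    return n_params * bits_per_param / 8 / 1e9

def timed_decode(translate, test_set):
    """Run a translation function over the whole test set and
    report total wall-clock processing time."""
    start = time.perf_counter()
    outputs = [translate(src) for src in test_set]
    elapsed = time.perf_counter() - start
    return outputs, elapsed

# Hypothetical stand-in for a compressed model's translate call.
def dummy_translate(src: str) -> str:
    return src.upper()

# An 8B-parameter model: ~16 GB at 16-bit weights, ~4 GB at 4-bit.
print(footprint_gb(8e9, 16))  # 16.0
print(footprint_gb(8e9, 4))   # 4.0

outputs, total_time = timed_decode(dummy_translate, ["hello", "world"])
```

Compression methods such as quantization shrink the first number (bits per parameter), while pruning or distillation shrink the second (parameter count); both are compatible with the criteria above.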
IMPORTANT DATES
Test data released: June 18, 2026
Model submission deadline: July 2, 2026
System description paper submission: in line with WMT26
Camera-ready submission: in line with WMT26
WMT 2026 Conference (co-located with EMNLP 2026 in Budapest, Hungary): November 2026