The goal of BabyLM is to bring together multiple disciplines to answer an enduring question: how can a computational system learn language from limited input? Cognitive scientists investigate this question by trying to understand how humans learn their native language during childhood. Computer scientists tackle it by attempting to build machine-learning systems that accomplish the task efficiently. BabyLM brings these two communities together, asking how insights from cognitive science can be used to build more sample-efficient language models, and how language-modeling architectures can in turn inspire research in cognitive science.
We call both for workshop papers and for participants in the 3rd BabyLM competition. As in previous years, we invite submissions to the data-efficient pretraining challenge. We retain three tracks from prior years: Strict (100M words), Strict-Small (10M words), and Multimodal (100M words plus unlimited images). This year, we also introduce a new track: Interaction. This track encourages interaction with an LLM teacher during pretraining, including architectures and training pipelines that enable the student LM to learn from the teacher and that adapt the teaching material to the student.
We also call for workshop papers outside the competition. Topics for workshop paper submissions include the following:
* Data-efficient architectures and training techniques
* Data curation for efficient training
* Cognitively and linguistically inspired language modeling and evaluation
* Scaling laws; large and small model comparisons
* Cognitively inspired multimodal modeling or evaluation
The 2025 BabyLM Workshop will be co-located with EMNLP 2025 in Suzhou, China. Important dates include the following:
* February: Training data and call for papers released
* Late April: Evaluation pipeline released
* May 19: EMNLP ARR deadline
* July 31: EMNLP commitment deadline
* Early August: Direct submission deadline
* Mid-September: Direct submission decisions released
* Early October: Camera-ready papers due
* Early November: Workshop in Suzhou, China