Overview
Language models have grown increasingly powerful at performing complex tasks, motivating the study of both their behavior and their internals. However, distinct research communities often pursue these two objectives in isolation, and as a result we lack robust, standardized interpretability methods for comprehensively assessing LM behavior in complex, real-world scenarios. To address this gap, this workshop promotes research and discussion on the interplay between model behavior and model internals. We aim to explore how understanding internal mechanisms can enhance our knowledge of complex model behaviors, and vice versa.
Call for Papers
We invite researchers working on evaluating model behavior, model internals, or both to submit work addressing (but not limited to) the following key questions:
- How can we jointly evaluate model behavior and internals?
- How do model interventions influence behavior, internals, and their interplay?
- How can we disentangle the influence of internal dynamics from external behavior?
- How do behavioral and internal evaluations align? Where and why do they differ?
- How do model size, architecture, and pre-training data influence the link between internals and behavior?
Submission Format
1) Main Track
- Empirical, methodological, or theoretical contributions
- Short papers: up to 4 pages (excl. references and appendices)
- Long papers: up to 9 pages (excl. references and appendices)
2) Flash Track
- Early ideas or preliminary results
- Up to 4 pages (excl. references and appendices)
3) Lessons Learned Track
- Insights from negative or inconclusive experiments
- Up to 4 pages (excl. references and appendices)
Dates
- June 23 - Submissions due
- July 24 - Acceptance notification
We look forward to your submissions!
Organizers:
Leshem Choshen, Vagrant Gautam, Yufang Hou, Anne Lauscher, Tamar Rott Shaham, Andreas Waldis
Steering Committee:
Jacob Andreas, David Bau, Yonatan Belinkov, Iryna Gurevych, Kyle Mahowald