The 8th Workshop on Analyzing and Interpreting Neural Networks for NLP will be co-located with EMNLP 2025 in Suzhou, China, this November!
* BlackboxNLP welcomes submissions of either archival papers (reporting completed and original research) or non-archival extended abstracts (work in progress or cross-submissions); details: https://blackboxnlp.github.io/2025/call/
* This edition will also host a shared task for benchmarking new techniques for localizing circuits and causal latent variables in language models; details: http://blackboxnlp.github.io/2025/task
Workshop description
-----------------
Many recent performance improvements in NLP have come at the cost of our understanding of how these systems work. How do we assess what representations and computations models learn? How do we formalize desirable properties of interpretable models, and measure the extent to which existing models achieve them? How can we build models that better encode these properties? What can new or existing tools tell us about these systems’ inductive biases?
The goal of this workshop is to bring together researchers focused on interpreting and explaining NLP models, drawing inspiration from fields such as machine learning, psychology, linguistics, and neuroscience. We hope the workshop will serve as an interdisciplinary meeting point that fosters collaboration across these fields.
Topics of interest include, but are not limited to:
* Mechanistic interpretability: reverse-engineering approaches to understanding particular properties of neural models.
* Understanding how language models use context by measuring their context-mixing processes (e.g., their token-to-token interactions).
* Scaling up analysis methods for large language models (LLMs).
* Analyzing techniques for steering LLM output behavior.
* Probing methods for testing whether models have acquired or represent certain linguistic properties.
* Adapting and applying analysis techniques from other disciplines (e.g., neuroscience or computer vision).
* Examining model performance on simplified or formal languages.
* Proposing modifications to neural architectures that increase their interpretability.
* Open-source tools for analysis, visualization, or explanation to democratize access to interpretability techniques in NLP.
* Explanation methods such as saliency, attribution, free-text explanations, or explanations with structured properties.
* Evaluation of explanation methods: how do we know the explanation is faithful to the model?
* Uncovering the reasoning processes of LLMs.
* Understanding the mechanisms underlying memorization in LLMs.
* Insights into LLM failures.
* Opinion pieces about the state of explainable NLP.
Submission types
-----------------
We call for two types of submissions:
* Archival papers of up to 8 pages + references. These are papers reporting on completed, original, and unpublished research; papers shorter than this maximum are also welcome. An optional appendix may appear after the references in the same PDF file. Accepted papers are expected to be presented at the workshop and will be published in the workshop proceedings of the ACL Anthology, meaning they cannot be published elsewhere. They should report on obtained results rather than intended work. These papers will undergo double-blind peer review and should thus be anonymized.
* Non-archival extended abstracts of 2 pages + references. These may report on work in progress or may be cross-submissions that have already appeared (or are scheduled to appear) at another venue. These submissions are non-archival and will not be included in the proceedings. Selection will not be based on double-blind review, so submissions of this type need not be anonymized.
Accepted submissions for both tracks will be presented at the workshop: most as posters, some as oral presentations (determined by the program committee).
Organizers
-----------------
Yonatan Belinkov, Technion
Aaron Mueller, Boston University
Najoung Kim, Boston University
Hanjie Chen, Rice University
Hosein Mohebbi, Tilburg University
Gabriele Sarti, University of Groningen
Dana Arad, Technion
Contact
-----------------
Please contact the organizers at blackboxnlp@googlegroups.com with any questions. For detailed information, including important dates, the program, and recent announcements, please visit our website: https://blackboxnlp.github.io/.