BlackboxNLP 2026: The 9th Workshop on Analyzing and Interpreting Neural Networks for NLP

Event Notification Type: 
Call for Papers
Abbreviated Title: 
BlackboxNLP 2026
Location: 
EMNLP 2026
State: 
Country: 
Hungary
Contact Email: 
City: 
Budapest
Contact: 
Hosein Mohebbi
Submission Deadline: 
Friday, 17 July 2026

Call for Papers: We are pleased to announce the ninth edition of the BlackboxNLP Workshop, co-located with EMNLP 2026 in Budapest, Hungary.

This year, we also invite submissions to our new Special Track on Reproducibility and Reliability in Interpretability Analyses (details below), aiming to advance rigorous and trustworthy research in NLP interpretability.

Important dates

*All deadlines are 11:59PM UTC-12:00 (“Anywhere on Earth”).

Overview

BlackboxNLP 2026 invites the submission of archival and non-archival papers featuring original and unpublished research on interpreting and explaining NLP models by taking inspiration from fields such as machine learning, psychology, linguistics, and neuroscience. We hope the workshop will serve as an interdisciplinary meetup that allows for cross-collaboration.
The topics of the workshop include, but are not limited to:

  • Adapting and applying analysis techniques from other disciplines, such as neuroscience, to analyze high-dimensional vector representations in artificial neural networks.
  • Examining model performance on simplified or formal languages.
  • Proposing architectural modifications to increase models’ interpretability.
  • Testing if interpretable information can be decoded from internal representations.
  • Open-source tools for analysis, visualization, or explanation to democratize access to interpretability techniques in NLP.
  • Meta-evaluation of analysis methods to assess their validity.
  • Understanding how and when language models rely on context information.
  • Analysing the linguistic properties captured by contextualised word representations.
  • Scaling up analysis methods for large language models (LLMs).
  • Mechanistic interpretability: reverse-engineering approaches to understanding particular properties of neural models.
  • Evaluation of techniques for steering LLM output behavior.
  • Uncovering the reasoning processes of LLMs.
  • Understanding the mechanisms underlying memorization in LLMs.
  • Insights into LLM failures.
  • Translating interpretability insights into practical solutions to address key challenges in NLP.
  • Opinion pieces about the state of interpretability in NLP.
  • Reproducibility and reliability in interpretability analyses (see the Special Track section below).

Special Track: Reproducibility and Reliability in Interpretability Analyses

Recent work has raised critical concerns about the significance and reproducibility of widely reported interpretability results, suggesting that popular analysis methods can yield plausible-looking explanations even when applied to randomly initialized neural networks. Relatedly, “interpretability illusions” observed in analyses of models like BERT and GPT-2 suggest that some interesting phenomena might be limited in scope to specific datasets or models. While reproduction and robustness studies are typically challenging to publish, we believe they are of great service to the interpretability research community. To promote high standards for the quality of new studies in this field, we introduce a special track inviting submissions of up to 6 pages focused on reproducing established interpretability results and complementing them with, for example, rigorous statistical evaluations to assess the magnitude of reported effects, or additional experiments on previously untested models and datasets. In particular, we encourage submissions that apply appropriate controls (e.g., random baselines), report effect sizes, and test generalization across datasets and model configurations.

Paper Submission Information
Submission Types

  • Archival papers of up to 8 pages + references. These are papers reporting on completed, original, and unpublished research. Papers shorter than this maximum are also welcome. An optional appendix may appear after the references in the same PDF file. Accepted papers are expected to be presented at the workshop and will be published in the workshop proceedings in the ACL Anthology, meaning they cannot be published elsewhere. They should report on obtained results rather than intended work. These papers will undergo double-blind peer review and should thus be anonymized.
  • Non-archival extended abstracts of 2 pages + references. These may report on work in progress or may be cross-submissions that have already appeared (or are scheduled to appear) in another venue. These submissions are non-archival and will not be included in the proceedings. The selection will not be based on a double-blind review and thus submissions of this type need not be anonymized.
  • Special track papers of up to 6 pages + references. These are papers focused on reproducing established interpretability results. See the Special Track section above for details. These papers will undergo double-blind peer review and should thus be anonymized.

*For Archival papers, we will accept direct submissions or ARR commitments through OpenReview.

*All submissions should use the *ACL template and formatting requirements, following the official EMNLP style guidelines. Archival papers must be fully anonymized.

*Accepted submissions for all tracks will be presented at the workshop: most as posters, some as oral presentations (determined by the program committee).

Organizers

  • Gabriele Sarti, Postdoc, Northeastern University
  • Hosein Mohebbi, AI Researcher at Whispp & Postdoc at University of Amsterdam
  • Martin Tutek, Postdoc, University of Zagreb
  • Dana Arad, PhD Candidate, Technion
  • Nils Feldhus, Postdoc, TU Berlin and BIFOLD
  • Hanjie Chen, Assistant Professor, Rice University
  • Aaron Mueller, Assistant Professor, Boston University
  • Yonatan Belinkov, Senior Lecturer, Technion

Contact

Please contact the organizers via email (blackboxnlp [at] googlegroups.com) with any questions.

Read more on the workshop's website:
https://blackboxnlp.github.io