2nd Workshop on Multimodal Augmented Generation via MultimodAl Retrieval

Event Notification Type: 
Call for Papers
Abbreviated Title: 
MAGMaR
Location: 
ACL 2026
Saturday, 4 July 2026
State: 
California
Country: 
United States
City: 
San Diego
Contact: 
Reno Kriz
Kenton Murray
Submission Deadline: 
Wednesday, 1 April 2026

Audiovisual media is becoming an increasingly dominant form of online information consumption. From firsthand, “in the wild” video footage of natural disasters to professionally edited news coverage of major political events, videos serve as rich sources of information for producing factual, grounded articles. Especially for actively unfolding events, grounding articles in video can help combat misinformation and provide journalists and analysts with tools to quickly synthesize new developments.

Individual research groups have independently begun addressing this challenge, leading to parallel yet disconnected efforts to define the research space. ACL 2025 hosted the first MAGMaR workshop, which focused on Video Event Retrieval. This year’s iteration focuses on two primary areas: (1) the retrieval of multimodal content spanning text, images, audio, and video; and (2) retrieval-augmented generation, with an emphasis on multimodal retrieval and grounded generation. To further this goal, we are again hosting a shared task, this year extended to full grounded article generation from multiple videos.

Relevant topics include document retrieval, multimodal retrieval, retrieval-augmented generation (RAG), multimodal RAG, multimodal question answering, and research on video, image, and audio understanding.

This workshop is organized in support of ACL's Special Interest Group on Image and Language (SIGIL).

The workshop will be a one-day hybrid event, allowing remote participation, and will be co-located with ACL 2026 in San Diego, USA, on 4 July 2026.