Call for Datasets and Papers
Many languages lack culturally specific evaluation datasets created by members of the language community themselves. The shared task at the 5th Multilingual Representation Learning (MRL) Workshop therefore invites participants to contribute a manually annotated physical commonsense reasoning evaluation dataset for their language(s). We especially welcome researchers who are native speakers of a language other than English, and researchers with close connections to native speakers of other languages.
We call for datasets in a format similar to PIQA, a physical commonsense reasoning benchmark in which each example consists of a prompt ("goal") and two candidate completions ("solutions"). Our aim is to collaboratively construct a multilingual physical reasoning benchmark with broad language coverage and culturally specific examples for each language. All authors of accepted submissions will have the option to be included on the resulting benchmark paper. The dataset submission deadline is September 15, 2025, and acceptance notifications will be sent on October 1, 2025. More information is available on our shared task page: https://sigtyp.github.io/st2025-mrl.html
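
For contributors unfamiliar with the PIQA layout, the following is a minimal Python sketch of what one example might look like when stored as a line of JSON. The field names (goal, sol1, sol2, label) follow the original PIQA release; the class name, the validation rules, and the sample item are illustrative assumptions, not the official submission format. Please consult the shared task page for the actual requirements.

```python
import json
from dataclasses import dataclass, asdict


@dataclass
class PhysicalReasoningExample:
    """One PIQA-style item (hypothetical helper, not an official schema)."""
    goal: str   # the prompt
    sol1: str   # first candidate solution
    sol2: str   # second candidate solution
    label: int  # 0 if sol1 is correct, 1 if sol2 is correct

    def __post_init__(self):
        # Basic sanity checks; these rules are assumptions for illustration.
        if self.label not in (0, 1):
            raise ValueError("label must be 0 or 1")
        for name in ("goal", "sol1", "sol2"):
            if not getattr(self, name).strip():
                raise ValueError(f"{name} must not be empty")

    def correct_solution(self) -> str:
        """Return the text of the solution marked as correct."""
        return self.sol1 if self.label == 0 else self.sol2


# Made-up sample item, for illustration only.
example = PhysicalReasoningExample(
    goal="How do you cool a hot cup of tea quickly?",
    sol1="Pour it back and forth between two cups.",
    sol2="Wrap the cup in a thick towel.",
    label=0,
)

# ensure_ascii=False keeps non-Latin scripts readable in the output file.
line = json.dumps(asdict(example), ensure_ascii=False)
print(line)
```

Writing one JSON object per line keeps each example independent and easy to review by hand, which may help when annotators work in different scripts.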