CrossFit: A Few-shot Learning Challenge for Cross-task Generalization in NLP

Qinyuan Ye; Bill Yuchen Lin; Xiang Ren

doi:10.18653/v1/2021.emnlp-main.572

Abstract

Humans can learn a new language task efficiently with only a few examples by leveraging knowledge obtained when learning prior tasks. In this paper, we explore whether and how such cross-task generalization ability can be acquired and then applied to build better few-shot learners across diverse NLP tasks. We introduce CrossFit, a problem setup for studying cross-task generalization ability, which standardizes seen/unseen task partitions, data access during different learning stages, and evaluation protocols. To instantiate different seen/unseen task partitions in CrossFit and facilitate in-depth analysis, we present the NLP Few-shot Gym, a repository of 160 diverse few-shot NLP tasks created from open-access NLP datasets and converted to a unified text-to-text format. Our analysis reveals that few-shot learning ability on unseen tasks can be improved via an upstream learning stage using a set of seen tasks. We also observe that the selection of upstream learning tasks can significantly influence few-shot performance on unseen tasks, calling for further analysis of task similarity and transferability.

Anthology ID: 2021.emnlp-main.572
Volume: Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing
Month: November
Year: 2021
Address: Online and Punta Cana, Dominican Republic
Editors: Marie-Francine Moens, Xuanjing Huang, Lucia Specia, Scott Wen-tau Yih
Venue: EMNLP
Publisher: Association for Computational Linguistics
Pages: 7163–7189
URL: https://aclanthology.org/2021.emnlp-main.572/
DOI: 10.18653/v1/2021.emnlp-main.572
Cite (ACL): Qinyuan Ye, Bill Yuchen Lin, and Xiang Ren. 2021. CrossFit: A Few-shot Learning Challenge for Cross-task Generalization in NLP. In Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing, pages 7163–7189, Online and Punta Cana, Dominican Republic. Association for Computational Linguistics.
Cite (Informal): CrossFit: A Few-shot Learning Challenge for Cross-task Generalization in NLP (Ye et al., EMNLP 2021)
PDF: https://aclanthology.org/2021.emnlp-main.572.pdf
Software: 2021.emnlp-main.572.Software.rar
Video: https://aclanthology.org/2021.emnlp-main.572.mp4
Code: INK-USC/CrossFit + additional community code
Data: ANLI, CoLA, Quoref, RACE
