An Analysis of Dataset Overlap on Winograd-Style Tasks

Ali Emami; Kaheer Suleman; Adam Trischler; Jackie Chi Kit Cheung

doi:10.18653/v1/2020.coling-main.515

Abstract

The Winograd Schema Challenge (WSC) and variants inspired by it have become important benchmarks for common-sense reasoning (CSR). Model performance on the WSC has quickly progressed from chance level to near-human level using neural language models trained on massive corpora. In this paper, we analyze the effects of varying degrees of overlap between these corpora and the test instances in WSC-style tasks. We find that a large number of test instances overlap considerably with the pretraining corpora on which state-of-the-art models are trained, and that a significant drop in classification accuracy occurs when models are evaluated on instances with minimal overlap. Based on these results, we provide the WSC-Web dataset, which consists of over 60k pronoun disambiguation problems scraped from web data. It is the largest such corpus to date and has a significantly lower proportion of overlaps with current pretraining corpora.

Anthology ID: 2020.coling-main.515
Volume: Proceedings of the 28th International Conference on Computational Linguistics
Month: December
Year: 2020
Address: Barcelona, Spain (Online)
Editors: Donia Scott, Nuria Bel, Chengqing Zong
Venue: COLING
Publisher: International Committee on Computational Linguistics
Pages: 5855–5865
URL: https://aclanthology.org/2020.coling-main.515
DOI: 10.18653/v1/2020.coling-main.515
Cite (ACL): Ali Emami, Kaheer Suleman, Adam Trischler, and Jackie Chi Kit Cheung. 2020. An Analysis of Dataset Overlap on Winograd-Style Tasks. In Proceedings of the 28th International Conference on Computational Linguistics, pages 5855–5865, Barcelona, Spain (Online). International Committee on Computational Linguistics.
Cite (Informal): An Analysis of Dataset Overlap on Winograd-Style Tasks (Emami et al., COLING 2020)
PDF: https://aclanthology.org/2020.coling-main.515.pdf
Data: GLUE, WSC, WinoGrande
