The human unlikeness of neural language models in next-word prediction

Cassandra L. Jacobs, Arya D. McCarthy


Abstract
The training objective of unidirectional language models (LMs) is similar to a psycholinguistic benchmark known as the cloze task, which measures next-word predictability. However, LMs lack the rich set of experiences that people have, and humans can be highly creative. To assess human parity in these models' training objective, we compare the predictions of three neural language models to those of human participants in a freely available behavioral dataset (Luke & Christianson, 2016). Our results show that while neural models correspond closely to human productions, they nevertheless assign insufficient probability to how often speakers guess upcoming words, especially for open-class content words.
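The comparison described in the abstract pits a unidirectional LM's next-word distribution against human cloze probabilities. A minimal sketch of that idea is below; it is not the authors' pipeline, and it assumes the Hugging Face `transformers` GPT-2 model plus an invented cloze context and participant counts purely for illustration.

```python
# Minimal sketch (not the authors' method): compare a unidirectional LM's
# next-word probability with a human cloze probability for the same context.
# The context, candidate word, and cloze counts below are hypothetical.
import torch
from transformers import GPT2LMHeadModel, GPT2Tokenizer

tokenizer = GPT2Tokenizer.from_pretrained("gpt2")
model = GPT2LMHeadModel.from_pretrained("gpt2")
model.eval()

context = "The children went outside to"  # hypothetical cloze context
target = " play"                          # candidate next word (leading space for GPT-2 BPE)

with torch.no_grad():
    input_ids = tokenizer(context, return_tensors="pt").input_ids
    logits = model(input_ids).logits[0, -1]      # logits over the next token
    probs = torch.softmax(logits, dim=-1)

target_id = tokenizer.encode(target)[0]          # first subword of the candidate
lm_prob = probs[target_id].item()

# Hypothetical human cloze probability: fraction of participants producing "play".
human_cloze_prob = 42 / 60

print(f"LM probability:    {lm_prob:.3f}")
print(f"Cloze probability: {human_cloze_prob:.3f}")
```

In practice such a comparison would be run over every word position in a corpus like the one of Luke & Christianson (2016), aggregating by word class; the single example above only illustrates the unit of comparison.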
Anthology ID:
2020.winlp-1.29
Volume:
Proceedings of the Fourth Widening Natural Language Processing Workshop
Month:
July
Year:
2020
Address:
Seattle, USA
Editors:
Rossana Cunha, Samira Shaikh, Erika Varis, Ryan Georgi, Alicia Tsai, Antonios Anastasopoulos, Khyathi Raghavi Chandu
Venue:
WiNLP
Publisher:
Association for Computational Linguistics
Pages:
115
URL:
https://aclanthology.org/2020.winlp-1.29
DOI:
10.18653/v1/2020.winlp-1.29
Cite (ACL):
Cassandra L. Jacobs and Arya D. McCarthy. 2020. The human unlikeness of neural language models in next-word prediction. In Proceedings of the Fourth Widening Natural Language Processing Workshop, page 115, Seattle, USA. Association for Computational Linguistics.
Cite (Informal):
The human unlikeness of neural language models in next-word prediction (Jacobs & McCarthy, WiNLP 2020)
Video:
 http://slideslive.com/38929570