Long-Tail Predictions with Continuous-Output Language Models

Shiran Dudy, Steven Bedrick


Abstract
Neural language models typically employ a categorical approach to prediction and training, leading to well-known computational and numerical limitations. An under-explored alternative is to perform prediction directly against a continuous word embedding space, which recent research suggests is more akin to how lexemes are represented in the brain. This approach opens the door to large-vocabulary language models and substantially reduces computational complexity. In this research we explore another important trait: continuous-output prediction models reach low-frequency vocabulary words, which we show are often ignored by categorical models. Such words are essential, as they can contribute to personalization and adaptation to a user's vocabulary. In this work, we explore continuous-space language modeling in the context of a word prediction task over two different textual domains (newswire text and biomedical journal articles). We investigate both traditional and adversarial training approaches, and report results using several different embedding spaces and decoding mechanisms. We find that our continuous-prediction approach outperforms the standard categorical approach in terms of term diversity, in particular with rare words.
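To make the core idea concrete, the following is a minimal sketch (not the paper's implementation) of continuous-output decoding: rather than a softmax over the vocabulary, the model emits a single embedding vector, and the predicted word is its nearest neighbor in a pretrained embedding table. The toy vocabulary, dimensions, and random embeddings below are all illustrative assumptions.

```python
import numpy as np

# Hypothetical illustration of continuous-output decoding.
# All names and sizes are illustrative, not taken from the paper.
rng = np.random.default_rng(0)
vocab = ["the", "cat", "sat", "perambulated"]       # toy vocabulary
emb = rng.standard_normal((len(vocab), 8))          # stand-in "pretrained" embeddings
emb /= np.linalg.norm(emb, axis=1, keepdims=True)   # unit-normalize rows

def decode(pred_vec):
    """Return the vocabulary word whose embedding is closest by cosine similarity."""
    pred_vec = pred_vec / np.linalg.norm(pred_vec)
    scores = emb @ pred_vec                         # cosine similarities to every word
    return vocab[int(np.argmax(scores))]

# Because decoding is a nearest-neighbor lookup rather than a softmax,
# a rare word is reachable whenever the model's output vector lands near
# its embedding:
print(decode(emb[vocab.index("perambulated")]))
```

Note that the cost of this decoding step scales with the embedding table lookup rather than with a full softmax normalization at training time, which is one source of the efficiency the abstract mentions.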
Anthology ID:
2020.winlp-1.31
Volume:
Proceedings of the Fourth Widening Natural Language Processing Workshop
Month:
July
Year:
2020
Address:
Seattle, USA
Editors:
Rossana Cunha, Samira Shaikh, Erika Varis, Ryan Georgi, Alicia Tsai, Antonios Anastasopoulos, Khyathi Raghavi Chandu
Venue:
WiNLP
Publisher:
Association for Computational Linguistics
Pages:
119–122
URL:
https://aclanthology.org/2020.winlp-1.31
DOI:
10.18653/v1/2020.winlp-1.31
Cite (ACL):
Shiran Dudy and Steven Bedrick. 2020. Long-Tail Predictions with Continuous-Output Language Models. In Proceedings of the Fourth Widening Natural Language Processing Workshop, pages 119–122, Seattle, USA. Association for Computational Linguistics.
Cite (Informal):
Long-Tail Predictions with Continuous-Output Language Models (Dudy & Bedrick, WiNLP 2020)
Video:
 http://slideslive.com/38929571