Do Language Embeddings capture Scales?

Xikun Zhang, Deepak Ramachandran, Ian Tenney, Yanai Elazar, Dan Roth


Abstract
Pretrained Language Models (LMs) have been shown to possess significant linguistic, common sense and factual knowledge. One form of knowledge that has not been studied yet in this context is information about the scalar magnitudes of objects. We show that pretrained language models capture a significant amount of this information but are short of the capability required for general common-sense reasoning. We identify contextual information in pre-training and numeracy as two key factors affecting their performance, and show that a simple method of canonicalizing numbers can have a significant effect on the results.
Anthology ID:
2020.findings-emnlp.439
Volume:
Findings of the Association for Computational Linguistics: EMNLP 2020
Month:
November
Year:
2020
Address:
Online
Editors:
Trevor Cohn, Yulan He, Yang Liu
Venue:
Findings
Publisher:
Association for Computational Linguistics
Pages:
4889–4896
URL:
https://aclanthology.org/2020.findings-emnlp.439
DOI:
10.18653/v1/2020.findings-emnlp.439
Cite (ACL):
Xikun Zhang, Deepak Ramachandran, Ian Tenney, Yanai Elazar, and Dan Roth. 2020. Do Language Embeddings capture Scales? In Findings of the Association for Computational Linguistics: EMNLP 2020, pages 4889–4896, Online. Association for Computational Linguistics.
Cite (Informal):
Do Language Embeddings capture Scales? (Zhang et al., Findings 2020)
PDF:
https://aclanthology.org/2020.findings-emnlp.439.pdf