A Corpus of Encyclopedia Articles with Logical Forms

Nathan Rasmussen, William Schuler


Abstract
People can extract precise, complex logical meanings from documents such as tax forms and game rules, but language processing systems lack adequate training and evaluation resources to perform such tasks reliably. This paper describes a corpus of annotated typed lambda calculus translations for approximately 2,000 sentences from Simple English Wikipedia, which is assumed to constitute a broad-coverage domain for precise, complex descriptions. The corpus contains a large number of quantifiers and interesting scoping configurations; it is presented specifically as a resource for quantifier scope disambiguation systems, and more generally as an object of linguistic study.
Anthology ID:
2020.lrec-1.132
Volume:
Proceedings of the Twelfth Language Resources and Evaluation Conference
Month:
May
Year:
2020
Address:
Marseille, France
Editors:
Nicoletta Calzolari, Frédéric Béchet, Philippe Blache, Khalid Choukri, Christopher Cieri, Thierry Declerck, Sara Goggi, Hitoshi Isahara, Bente Maegaard, Joseph Mariani, Hélène Mazo, Asuncion Moreno, Jan Odijk, Stelios Piperidis
Venue:
LREC
Publisher:
European Language Resources Association
Pages:
1051–1060
Language:
English
URL:
https://aclanthology.org/2020.lrec-1.132
Cite (ACL):
Nathan Rasmussen and William Schuler. 2020. A Corpus of Encyclopedia Articles with Logical Forms. In Proceedings of the Twelfth Language Resources and Evaluation Conference, pages 1051–1060, Marseille, France. European Language Resources Association.
Cite (Informal):
A Corpus of Encyclopedia Articles with Logical Forms (Rasmussen & Schuler, LREC 2020)
PDF:
https://aclanthology.org/2020.lrec-1.132.pdf