A Data-Oriented Model of Literary Language

Andreas van Cranenburgh, Rens Bod


Abstract
We consider the task of predicting how literary a text is, with a gold standard from human ratings. Aside from a standard bigram baseline, we apply rich syntactic tree fragments, mined from the training set, and a series of hand-picked features. Our model is the first to distinguish degrees of highly and less literary novels using a variety of lexical and syntactic features, and explains 76.0% of the variation in literary ratings.
Anthology ID:
E17-1115
Volume:
Proceedings of the 15th Conference of the European Chapter of the Association for Computational Linguistics: Volume 1, Long Papers
Month:
April
Year:
2017
Address:
Valencia, Spain
Editors:
Mirella Lapata, Phil Blunsom, Alexander Koller
Venue:
EACL
Publisher:
Association for Computational Linguistics
Pages:
1228–1238
URL:
https://aclanthology.org/E17-1115
Cite (ACL):
Andreas van Cranenburgh and Rens Bod. 2017. A Data-Oriented Model of Literary Language. In Proceedings of the 15th Conference of the European Chapter of the Association for Computational Linguistics: Volume 1, Long Papers, pages 1228–1238, Valencia, Spain. Association for Computational Linguistics.
Cite (Informal):
A Data-Oriented Model of Literary Language (van Cranenburgh & Bod, EACL 2017)
PDF:
https://aclanthology.org/E17-1115.pdf