Measuring text readability with machine comprehension: a pilot study

Marc Benzahra, François Yvon


Abstract
This article studies the relationship between text readability indices and automatic machine understanding systems. Our hypothesis is that the simpler a text is, the better it should be understood by a machine. We thus expect a strong correlation between readability levels on the one hand, and the performance of automatic reading systems on the other hand. We test this hypothesis with several understanding systems based on language models of varying strengths, measuring this correlation on two corpora of journalistic texts. Our results suggest that this correlation is rather weak and that existing comprehension systems are far from showing the gradual improvement in performance on texts of decreasing complexity that the hypothesis predicts.
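The following is a minimal sketch of the kind of correlation analysis the abstract describes. The specific readability index (Flesch reading ease via the textstat package), the comprehension metric passed in by the caller, and the use of Spearman correlation are assumptions for illustration only; the paper's actual systems, corpora, and measures are not reproduced here.

```python
# Sketch: correlate text readability with machine comprehension performance.
# Assumptions (not from the paper): Flesch reading ease stands in for the
# readability level, and the comprehension scores are supplied externally
# (e.g., per-text accuracy of a reading-comprehension system).
from typing import List, Sequence

import textstat                      # pip install textstat
from scipy.stats import spearmanr    # pip install scipy


def readability_scores(texts: Sequence[str]) -> List[float]:
    """Flesch reading ease per text (higher = easier to read)."""
    return [textstat.flesch_reading_ease(t) for t in texts]


def readability_vs_comprehension(texts: Sequence[str],
                                 comprehension_scores: Sequence[float]) -> float:
    """Spearman correlation between readability and comprehension performance.

    Under the paper's hypothesis, simpler texts should be better understood,
    i.e. the correlation should be strongly positive.
    """
    rho, _pvalue = spearmanr(readability_scores(texts), comprehension_scores)
    return rho


if __name__ == "__main__":
    docs = [
        "The cat sat on the mat.",
        "Notwithstanding prior stipulations, the aforementioned clause applies.",
    ]
    # Hypothetical comprehension scores for the two texts.
    print(readability_vs_comprehension(docs, [0.9, 0.4]))
```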
Anthology ID:
W19-4443
Volume:
Proceedings of the Fourteenth Workshop on Innovative Use of NLP for Building Educational Applications
Month:
August
Year:
2019
Address:
Florence, Italy
Editors:
Helen Yannakoudakis, Ekaterina Kochmar, Claudia Leacock, Nitin Madnani, Ildikó Pilán, Torsten Zesch
Venue:
BEA
SIG:
SIGEDU
Publisher:
Association for Computational Linguistics
Note:
Pages:
412–422
Language:
URL:
https://aclanthology.org/W19-4443
DOI:
10.18653/v1/W19-4443
Bibkey:
Cite (ACL):
Marc Benzahra and François Yvon. 2019. Measuring text readability with machine comprehension: a pilot study. In Proceedings of the Fourteenth Workshop on Innovative Use of NLP for Building Educational Applications, pages 412–422, Florence, Italy. Association for Computational Linguistics.
Cite (Informal):
Measuring text readability with machine comprehension: a pilot study (Benzahra & Yvon, BEA 2019)
PDF:
https://aclanthology.org/W19-4443.pdf
Data
WebText