An Automatic Approach for Document-level Topic Model Evaluation

Shraey Bhatia; Jey Han Lau; Timothy Baldwin

doi:10.18653/v1/K17-1022

Abstract

Topic models jointly learn topics and document-level topic distribution. Extrinsic evaluation of topic models tends to focus exclusively on topic-level evaluation, e.g. by assessing the coherence of topics. We demonstrate that there can be large discrepancies between topic- and document-level model quality, and that basing model evaluation on topic-level analysis can be highly misleading. We propose a method for automatically predicting topic model quality based on analysis of document-level topic allocations, and provide empirical evidence for its robustness.

Anthology ID: K17-1022
Volume: Proceedings of the 21st Conference on Computational Natural Language Learning (CoNLL 2017)
Month: August
Year: 2017
Address: Vancouver, Canada
Editors: Roger Levy, Lucia Specia
Venue: CoNLL
SIG: SIGNLL
Publisher: Association for Computational Linguistics
Pages: 206–215
URL: https://aclanthology.org/K17-1022/
DOI: 10.18653/v1/K17-1022
Cite (ACL): Shraey Bhatia, Jey Han Lau, and Timothy Baldwin. 2017. An Automatic Approach for Document-level Topic Model Evaluation. In Proceedings of the 21st Conference on Computational Natural Language Learning (CoNLL 2017), pages 206–215, Vancouver, Canada. Association for Computational Linguistics.
Cite (Informal): An Automatic Approach for Document-level Topic Model Evaluation (Bhatia et al., CoNLL 2017)
PDF: https://aclanthology.org/K17-1022.pdf
