Sprite: Generalizing Topic Models with Structured Priors

Michael J. Paul, Mark Dredze


Abstract
We introduce Sprite, a family of topic models that incorporates structure into model priors as a function of underlying components. The structured priors can be constrained to model topic hierarchies, factorizations, correlations, and supervision, allowing Sprite to be tailored to particular settings. We demonstrate this flexibility by constructing a Sprite-based model to jointly infer topic hierarchies and author perspective, which we apply to corpora of political debates and online reviews. We show that the model learns intuitive topics, outperforming several other topic models at predictive tasks.
Anthology ID: Q15-1004
Volume: Transactions of the Association for Computational Linguistics, Volume 3
Year: 2015
Address: Cambridge, MA
Editors: Michael Collins, Lillian Lee
Venue: TACL
Publisher: MIT Press
Pages: 43–57
URL: https://aclanthology.org/Q15-1004
DOI: 10.1162/tacl_a_00121
Cite (ACL): Michael J. Paul and Mark Dredze. 2015. Sprite: Generalizing Topic Models with Structured Priors. Transactions of the Association for Computational Linguistics, 3:43–57.
Cite (Informal): Sprite: Generalizing Topic Models with Structured Priors (Paul & Dredze, TACL 2015)
PDF: https://aclanthology.org/Q15-1004.pdf