Challenges of Using Text Classifiers for Causal Inference

Zach Wood-Doughty, Ilya Shpitser, Mark Dredze


Abstract
Causal understanding is essential for many kinds of decision-making, but causal inference from observational data has typically only been applied to structured, low-dimensional datasets. While text classifiers produce low-dimensional outputs, their use in causal inference has not previously been studied. To facilitate causal analyses based on language data, we consider the role that text classifiers can play in causal inference through established modeling mechanisms from the causality literature on missing data and measurement error. We demonstrate how to conduct causal analyses using text classifiers on simulated and Yelp data, and discuss the opportunities and challenges of future work that uses text data in causal inference.
Anthology ID:
D18-1488
Volume:
Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing
Month:
October-November
Year:
2018
Address:
Brussels, Belgium
Editors:
Ellen Riloff, David Chiang, Julia Hockenmaier, Jun’ichi Tsujii
Venue:
EMNLP
SIG:
SIGDAT
Publisher:
Association for Computational Linguistics
Pages:
4586–4598
URL:
https://aclanthology.org/D18-1488
DOI:
10.18653/v1/D18-1488
Cite (ACL):
Zach Wood-Doughty, Ilya Shpitser, and Mark Dredze. 2018. Challenges of Using Text Classifiers for Causal Inference. In Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing, pages 4586–4598, Brussels, Belgium. Association for Computational Linguistics.
Cite (Informal):
Challenges of Using Text Classifiers for Causal Inference (Wood-Doughty et al., EMNLP 2018)
PDF:
https://aclanthology.org/D18-1488.pdf
Code:
zachwooddoughty/emnlp2018-causal