The Gun Violence Database: A new task and data set for NLP

Ellie Pavlick¹, Heng Ji², Xiaoman Pan², Chris Callison-Burch¹
¹University of Pennsylvania, ²Rensselaer Polytechnic Institute

Abstract

We argue that NLP researchers are especially well-positioned to contribute to the national discussion about gun violence. Reasoning about the causes and outcomes of gun violence is typically dominated by politics and emotion, and data-driven research on the topic is stymied by a shortage of data and a lack of federal funding. However, data abounds in the form of unstructured text from news articles across the country. Turning this text into structured data is an ideal application of NLP technologies such as relation extraction, coreference resolution, and event detection. We introduce a new and growing dataset, the Gun Violence Database, to facilitate the adaptation of current NLP technologies to the domain of gun violence, thus enabling better social science research on this important and under-resourced problem.