Deep Reinforcement Learning for Mention-Ranking Coreference Models

Kevin Clark and Christopher D. Manning
Stanford University


Abstract

Coreference resolution systems are typically trained with heuristic loss functions that require careful tuning. In this paper we instead apply reinforcement learning to directly optimize a neural mention-ranking model for coreference evaluation metrics. We experiment with two approaches: the REINFORCE policy gradient algorithm and a reward-rescaled max-margin objective. We find the latter to be more effective, resulting in a significant improvement over the current state-of-the-art on the English and Chinese portions of the CoNLL 2012 Shared Task.
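To make the second approach concrete, below is a minimal sketch (illustrative code, not the authors' implementation) of a reward-rescaled max-margin loss for a single mention: the margin violation for each candidate action is scaled by how much taking that action would hurt the final coreference reward (e.g. B-cubed) with all other decisions held fixed. The names `scores`, `gold`, and `reward_if_chosen` are assumptions introduced for this example.

```python
import numpy as np

def reward_rescaled_max_margin_loss(scores, gold, reward_if_chosen):
    """Reward-rescaled max-margin loss for one mention (illustrative sketch).

    scores:            (K,) model scores for the K candidate actions
                       (each candidate antecedent, plus "start new cluster").
    gold:              indices of actions consistent with the gold clustering.
    reward_if_chosen:  (K,) coreference reward obtained if this action is
                       taken while all other decisions stay fixed.
    """
    # Highest-scoring action that agrees with the gold clustering.
    t_hat = max(gold, key=lambda i: scores[i])
    # Cost of each action = reward lost relative to the best achievable
    # reward; actions consistent with the gold clustering incur zero cost.
    cost = reward_if_chosen.max() - reward_if_chosen
    # Rescaled margin: each mistake must be outscored by a margin of 1,
    # weighted by how damaging that mistake is to the evaluation metric.
    margins = cost * np.maximum(0.0, 1.0 + scores - scores[t_hat])
    return margins.max()

# Toy usage: 3 candidate antecedents + "start new cluster" (index 3).
scores = np.array([1.2, 0.3, -0.5, 0.9])
gold = [3]                                   # mention actually starts a new cluster
reward = np.array([0.80, 0.85, 0.78, 0.92])  # metric value under each choice
print(reward_rescaled_max_margin_loss(scores, gold, reward))
```

Because the cost comes directly from the evaluation metric rather than from hand-set error penalties, this objective avoids the heuristic hyperparameter tuning the abstract refers to.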