Span-Based Constituency Parsing with a Structure-Label System and Provably Optimal Dynamic Oracles

James Cross¹ and Liang Huang²
¹Oregon State University, ²City University of New York (CUNY)


Abstract

The accuracy of highly efficient greedy transition-based parsers has improved dramatically in recent years thanks to neural-network models. Despite striking results in dependency parsing, however, neural models have not yet surpassed state-of-the-art approaches in constituency parsing. To remedy this, we introduce a new parsing system whose stack holds sentence spans, each represented by a bare minimum of LSTM features. We also describe a dynamic oracle for this constituency parsing system. Training with this oracle, we achieve the best performance on the Penn Treebank of any parser that does not use reranking or external data.