An Evaluation of Parser Robustness for Ungrammatical Sentences

Homa B. Hashemi¹ and Rebecca Hwa²
¹Intelligent Systems Program, University of Pittsburgh; ²University of Pittsburgh


Abstract

For many NLP applications that require a parser, the sentences of interest may not be well-formed. If the parser can overlook problems such as grammar mistakes and produce a parse tree that closely resembles the correct analysis for the intended sentence, we say that the parser is robust. This paper compares the performance of eight state-of-the-art dependency parsers on two domains of ungrammatical sentences: learner English and machine translation outputs. We have developed an evaluation metric for parser robustness and conducted a suite of experiments. Our analyses may help practitioners choose an appropriate parser for their tasks and help developers improve parser robustness against ungrammatical sentences.