Classifying Syntactic Errors in Learner Language

Leshem Choshen, Dmitry Nikolaev, Yevgeni Berzak, Omri Abend


Abstract
We present a method for classifying syntactic errors in learner language, namely errors whose correction alters the morphosyntactic structure of a sentence. The methodology builds on the established Universal Dependencies syntactic representation scheme, and provides complementary information to other error-classification systems. Unlike existing error classification methods, our method is applicable across languages, which we showcase by producing a detailed picture of syntactic errors in learner English and learner Russian. We further demonstrate the utility of the methodology for analyzing the outputs of leading Grammatical Error Correction (GEC) systems.
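As a rough illustration of the idea in the abstract, the sketch below labels an error by the pair of Universal Dependencies part-of-speech tags that an aligned word carries before and after correction. It is a minimal sketch under stated assumptions, not the authors' implementation: the function name `classify_divergences`, the hand-written UPOS tags, and the precomputed word alignment are all hypothetical, and real input would come from a UD parser and an aligner.

```python
from collections import Counter
from typing import List, Tuple

# A token is (word form, UPOS tag). Tags here are written by hand for
# illustration; in practice they would come from a UD parser (assumption).
Token = Tuple[str, str]


def classify_divergences(
    source: List[Token],
    corrected: List[Token],
    alignment: List[Tuple[int, int]],
) -> Counter:
    """Count "SRC->TGT" tag pairs for aligned tokens whose UPOS tags differ.

    `alignment` holds (source index, corrected index) pairs and is assumed
    to be given; computing it is outside the scope of this sketch.
    """
    labels = []
    for src_i, cor_i in alignment:
        src_tag = source[src_i][1]
        cor_tag = corrected[cor_i][1]
        if src_tag != cor_tag:
            labels.append(f"{src_tag}->{cor_tag}")
    return Counter(labels)


# Learner sentence and its correction, aligned word by word.
source = [("She", "PRON"), ("sings", "VERB"), ("beautiful", "ADJ")]
corrected = [("She", "PRON"), ("sings", "VERB"), ("beautifully", "ADV")]
alignment = [(0, 0), (1, 1), (2, 2)]

result = classify_divergences(source, corrected, alignment)
print(result)
```

Because the labels depend only on UD tags, the same function applies unchanged to any language with a UD annotation, which is the property the abstract highlights for English and Russian.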
Anthology ID: 2020.conll-1.7
Volume: Proceedings of the 24th Conference on Computational Natural Language Learning
Month: November
Year: 2020
Address: Online
Editors: Raquel Fernández, Tal Linzen
Venue: CoNLL
SIG: SIGNLL
Publisher: Association for Computational Linguistics
Pages: 97–107
URL: https://aclanthology.org/2020.conll-1.7
DOI: 10.18653/v1/2020.conll-1.7
Cite (ACL): Leshem Choshen, Dmitry Nikolaev, Yevgeni Berzak, and Omri Abend. 2020. Classifying Syntactic Errors in Learner Language. In Proceedings of the 24th Conference on Computational Natural Language Learning, pages 97–107, Online. Association for Computational Linguistics.
Cite (Informal): Classifying Syntactic Errors in Learner Language (Choshen et al., CoNLL 2020)
PDF: https://aclanthology.org/2020.conll-1.7.pdf
Optional supplementary material: 2020.conll-1.7.OptionalSupplementaryMaterial.pdf
Code: borgr/GEC_UD_divergences
Data: Universal Dependencies