Universal Dependency Parsing from Scratch

Peng Qi, Timothy Dozat, Yuhao Zhang, Christopher D. Manning


Abstract
This paper describes Stanford’s system at the CoNLL 2018 UD Shared Task. We introduce a complete neural pipeline system that takes raw text as input and performs all tasks required by the shared task, from tokenization and sentence segmentation to POS tagging and dependency parsing. Our single system submission achieved very competitive performance on big treebanks. Moreover, after fixing an unfortunate bug, our corrected system would have placed 2nd, 1st, and 3rd on the official evaluation metrics LAS, MLAS, and BLEX, respectively, and would have outperformed all submitted systems on low-resource treebank categories on all metrics by a large margin. We further show the effectiveness of different model components through extensive ablation studies.
Anthology ID:
K18-2016
Volume:
Proceedings of the CoNLL 2018 Shared Task: Multilingual Parsing from Raw Text to Universal Dependencies
Month:
October
Year:
2018
Address:
Brussels, Belgium
Editors:
Daniel Zeman, Jan Hajič
Venue:
CoNLL
SIG:
SIGNLL
Publisher:
Association for Computational Linguistics
Pages:
160–170
URL:
https://aclanthology.org/K18-2016
DOI:
10.18653/v1/K18-2016
Cite (ACL):
Peng Qi, Timothy Dozat, Yuhao Zhang, and Christopher D. Manning. 2018. Universal Dependency Parsing from Scratch. In Proceedings of the CoNLL 2018 Shared Task: Multilingual Parsing from Raw Text to Universal Dependencies, pages 160–170, Brussels, Belgium. Association for Computational Linguistics.
Cite (Informal):
Universal Dependency Parsing from Scratch (Qi et al., CoNLL 2018)
PDF:
https://aclanthology.org/K18-2016.pdf
Code:
stanfordnlp/stanfordnlp
Data:
Universal Dependencies