Turku Enhanced Parser Pipeline: From Raw Text to Enhanced Graphs in the IWPT 2020 Shared Task

Jenna Kanerva, Filip Ginter, Sampo Pyysalo


Abstract
We present the approach of the TurkuNLP group to the IWPT 2020 shared task on Multilingual Parsing into Enhanced Universal Dependencies. The task involves 28 treebanks in 17 languages and requires parsers to generate graph structures that extend the basic dependency trees. Our approach combines language-specific BERT models, the UDify parser, neural sequence-to-sequence lemmatization, and a graph transformation approach that encodes the enhanced structure into a dependency tree. Our submission averaged 84.5% ELAS, ranking first in the shared task. We make all methods and resources developed for this study freely available under open licenses from https://turkunlp.org.
Anthology ID:
2020.iwpt-1.17
Volume:
Proceedings of the 16th International Conference on Parsing Technologies and the IWPT 2020 Shared Task on Parsing into Enhanced Universal Dependencies
Month:
July
Year:
2020
Address:
Online
Editors:
Gosse Bouma, Yuji Matsumoto, Stephan Oepen, Kenji Sagae, Djamé Seddah, Weiwei Sun, Anders Søgaard, Reut Tsarfaty, Dan Zeman
Venue:
IWPT
SIG:
SIGPARSE
Publisher:
Association for Computational Linguistics
Pages:
162–173
URL:
https://aclanthology.org/2020.iwpt-1.17
DOI:
10.18653/v1/2020.iwpt-1.17
Cite (ACL):
Jenna Kanerva, Filip Ginter, and Sampo Pyysalo. 2020. Turku Enhanced Parser Pipeline: From Raw Text to Enhanced Graphs in the IWPT 2020 Shared Task. In Proceedings of the 16th International Conference on Parsing Technologies and the IWPT 2020 Shared Task on Parsing into Enhanced Universal Dependencies, pages 162–173, Online. Association for Computational Linguistics.
Cite (Informal):
Turku Enhanced Parser Pipeline: From Raw Text to Enhanced Graphs in the IWPT 2020 Shared Task (Kanerva et al., IWPT 2020)
PDF:
https://aclanthology.org/2020.iwpt-1.17.pdf
Video:
http://slideslive.com/38929684