Enriched In-Order Linearization for Faster Sequence-to-Sequence Constituent Parsing

Daniel Fernández-González, Carlos Gómez-Rodríguez


Abstract
Sequence-to-sequence constituent parsing requires a linearization to represent trees as sequences. Top-down tree linearizations, which can be based on brackets or shift-reduce actions, have achieved the best accuracy to date. In this paper, we show that these results can be improved by using an in-order linearization instead. Based on this observation, we implement an enriched in-order shift-reduce linearization inspired by the approach of Vinyals et al. (2015), achieving the best accuracy to date on the English PTB dataset among fully supervised single-model sequence-to-sequence constituent parsers. Finally, we apply deterministic attention mechanisms to match the speed of state-of-the-art transition-based parsers, thus showing that sequence-to-sequence models can match them not only in accuracy but also in speed.
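To make the contrast between the two linearizations concrete, here is a minimal sketch that turns a small constituent tree into a top-down and a plain in-order shift-reduce action sequence. The `Node` class, the function names, and the action labels (`NT-X`, `PJ-X`, `SHIFT`, `REDUCE`) are illustrative assumptions, not taken from the paper, and the sketch does not implement the enriched variant or the attention mechanisms the abstract describes.

```python
from dataclasses import dataclass, field
from typing import List, Union


@dataclass
class Node:
    """Hypothetical constituent node: a label plus child nodes or words."""
    label: str
    children: List[Union["Node", str]] = field(default_factory=list)


def top_down(tree: Union[Node, str]) -> List[str]:
    """Pre-order: open the non-terminal, emit all children, then close it."""
    if isinstance(tree, str):  # a word is shifted
        return ["SHIFT"]
    actions = [f"NT-{tree.label}"]
    for child in tree.children:
        actions += top_down(child)
    return actions + ["REDUCE"]


def in_order(tree: Union[Node, str]) -> List[str]:
    """In-order: emit the first child, then project the non-terminal,
    then emit the remaining children, then close the constituent."""
    if isinstance(tree, str):
        return ["SHIFT"]
    first, *rest = tree.children
    actions = in_order(first) + [f"PJ-{tree.label}"]
    for child in rest:
        actions += in_order(child)
    return actions + ["REDUCE"]


# (S (NP The cat) (VP sleeps))
tree = Node("S", [Node("NP", ["The", "cat"]), Node("VP", ["sleeps"])])
print(" ".join(top_down(tree)))
print(" ".join(in_order(tree)))
```

Both sequences contain the same number of actions for a given tree; they differ only in when each non-terminal is introduced relative to its first child.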
Anthology ID:
2020.acl-main.376
Volume:
Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics
Month:
July
Year:
2020
Address:
Online
Editors:
Dan Jurafsky, Joyce Chai, Natalie Schluter, Joel Tetreault
Venue:
ACL
Publisher:
Association for Computational Linguistics
Pages:
4092–4099
URL:
https://aclanthology.org/2020.acl-main.376
DOI:
10.18653/v1/2020.acl-main.376
Cite (ACL):
Daniel Fernández-González and Carlos Gómez-Rodríguez. 2020. Enriched In-Order Linearization for Faster Sequence-to-Sequence Constituent Parsing. In Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics, pages 4092–4099, Online. Association for Computational Linguistics.
Cite (Informal):
Enriched In-Order Linearization for Faster Sequence-to-Sequence Constituent Parsing (Fernández-González & Gómez-Rodríguez, ACL 2020)
PDF:
https://aclanthology.org/2020.acl-main.376.pdf
Video:
http://slideslive.com/38928747