An Empirical Comparison of Unsupervised Constituency Parsing Methods

Jun Li, Yifan Cao, Jiong Cai, Yong Jiang, Kewei Tu


Abstract
Unsupervised constituency parsing aims to learn a constituency parser from a training corpus without parse tree annotations. While many methods have been proposed to tackle the problem, including statistical and neural methods, their experimental results are often not directly comparable due to discrepancies in datasets, data preprocessing, lexicalization, and evaluation metrics. In this paper, we first examine experimental settings used in previous work and propose to standardize the settings for better comparability between methods. We then empirically compare several existing methods, including decade-old and newly proposed ones, under the standardized settings on English and Japanese, two languages with different branching tendencies. We find that recent models do not show a clear advantage over decade-old models in our experiments. We hope our work can provide new insights into existing methods and facilitate future empirical evaluation of unsupervised constituency parsing.
Anthology ID:
2020.acl-main.300
Volume:
Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics
Month:
July
Year:
2020
Address:
Online
Editors:
Dan Jurafsky, Joyce Chai, Natalie Schluter, Joel Tetreault
Venue:
ACL
Publisher:
Association for Computational Linguistics
Pages:
3278–3283
URL:
https://aclanthology.org/2020.acl-main.300
DOI:
10.18653/v1/2020.acl-main.300
Cite (ACL):
Jun Li, Yifan Cao, Jiong Cai, Yong Jiang, and Kewei Tu. 2020. An Empirical Comparison of Unsupervised Constituency Parsing Methods. In Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics, pages 3278–3283, Online. Association for Computational Linguistics.
Cite (Informal):
An Empirical Comparison of Unsupervised Constituency Parsing Methods (Li et al., ACL 2020)
PDF:
https://aclanthology.org/2020.acl-main.300.pdf
Video:
http://slideslive.com/38929344