Unsupervised Discourse Constituency Parsing Using Viterbi EM

Noriki Nishida, Hideki Nakayama


Abstract
In this paper, we introduce an unsupervised discourse constituency parsing algorithm. We use Viterbi EM with a margin-based criterion to train a span-based discourse parser in an unsupervised manner. We also propose initialization methods for Viterbi training of discourse constituents based on our prior knowledge of text structures. Experimental results demonstrate that our unsupervised parser achieves comparable or even superior performance to fully supervised parsers. We also investigate discourse constituents that are learned by our method.
Anthology ID:
2020.tacl-1.15
Volume:
Transactions of the Association for Computational Linguistics, Volume 8
Year:
2020
Address:
Cambridge, MA
Editors:
Mark Johnson, Brian Roark, Ani Nenkova
Venue:
TACL
Publisher:
MIT Press
Pages:
215–230
URL:
https://aclanthology.org/2020.tacl-1.15
DOI:
10.1162/tacl_a_00312
Cite (ACL):
Noriki Nishida and Hideki Nakayama. 2020. Unsupervised Discourse Constituency Parsing Using Viterbi EM. Transactions of the Association for Computational Linguistics, 8:215–230.
Cite (Informal):
Unsupervised Discourse Constituency Parsing Using Viterbi EM (Nishida & Nakayama, TACL 2020)
PDF:
https://aclanthology.org/2020.tacl-1.15.pdf
Code:
norikinishida/DiscourseConstituencyInduction-ViterbiEM