Semi-Supervised Tri-Training for Explicit Discourse Argument Expansion

René Knaebel, Manfred Stede


Abstract
This paper describes a novel application of semi-supervision to shallow discourse parsing. We use a neural sequence-tagging approach and focus on the extraction of explicit discourse arguments. First, additional unlabeled data is prepared for semi-supervised learning. From this data, weak annotations are generated in a first setting and later used in a second setting to study performance differences. Our studies show performance gains of 2–10% F1 for our models. Further, we give some insights into the generated discourse annotations and compare the newly generated relations with the training relations. We release this new dataset of explicit discourse arguments to enable the training of large statistical models.
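The tri-training scheme the abstract refers to can be sketched as follows: three learners are trained on bootstrap samples of the labeled data, and whenever two of them agree on an unlabeled example, that example (with its weak label) is added to the third learner's training set. This is a minimal illustrative sketch, not the paper's implementation — the `CentroidClassifier` is a toy stand-in for the paper's neural sequence taggers, and names such as `tri_train` are assumptions for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

class CentroidClassifier:
    """Toy binary classifier (stand-in for a neural tagger):
    predicts the class whose centroid is nearest."""
    def fit(self, X, y):
        self.centroids = np.stack([X[y == c].mean(axis=0) for c in (0, 1)])
        return self

    def predict(self, X):
        d = np.linalg.norm(X[:, None, :] - self.centroids[None], axis=2)
        return d.argmin(axis=1)

def tri_train(X_lab, y_lab, X_unlab, rounds=3):
    # Three diverse learners from bootstrap samples of the labeled data.
    idxs = [rng.integers(0, len(X_lab), len(X_lab)) for _ in range(3)]
    data = [(X_lab[i], y_lab[i]) for i in idxs]
    models = [CentroidClassifier().fit(X, y) for X, y in data]
    for _ in range(rounds):
        preds = [m.predict(X_unlab) for m in models]
        for k in range(3):
            i, j = [m for m in range(3) if m != k]
            # Where two learners agree, their prediction becomes a
            # weak label for the third learner's training set.
            agree = preds[i] == preds[j]
            if agree.any():
                Xk = np.concatenate([data[k][0], X_unlab[agree]])
                yk = np.concatenate([data[k][1], preds[i][agree]])
                data[k] = (Xk, yk)
                models[k] = CentroidClassifier().fit(Xk, yk)
    return models

def predict_vote(models, X):
    # Final prediction by majority vote of the three learners.
    votes = np.stack([m.predict(X) for m in models])
    return (votes.sum(axis=0) >= 2).astype(int)
```

In the paper's setting, the learners are neural taggers over token sequences rather than point classifiers, but the agreement-based expansion of the training data follows the same pattern.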
Anthology ID:
2020.lrec-1.139
Volume:
Proceedings of the Twelfth Language Resources and Evaluation Conference
Month:
May
Year:
2020
Address:
Marseille, France
Editors:
Nicoletta Calzolari, Frédéric Béchet, Philippe Blache, Khalid Choukri, Christopher Cieri, Thierry Declerck, Sara Goggi, Hitoshi Isahara, Bente Maegaard, Joseph Mariani, Hélène Mazo, Asuncion Moreno, Jan Odijk, Stelios Piperidis
Venue:
LREC
Publisher:
European Language Resources Association
Pages:
1103–1109
Language:
English
URL:
https://aclanthology.org/2020.lrec-1.139
Cite (ACL):
René Knaebel and Manfred Stede. 2020. Semi-Supervised Tri-Training for Explicit Discourse Argument Expansion. In Proceedings of the Twelfth Language Resources and Evaluation Conference, pages 1103–1109, Marseille, France. European Language Resources Association.
Cite (Informal):
Semi-Supervised Tri-Training for Explicit Discourse Argument Expansion (Knaebel & Stede, LREC 2020)
PDF:
https://aclanthology.org/2020.lrec-1.139.pdf