Maximum Margin Reward Networks for Learning from Explicit and Implicit Supervision

Haoruo Peng, Ming-Wei Chang, Wen-tau Yih


Abstract
Trained in a fully supervised fashion, neural networks have achieved state-of-the-art performance on several structured-output prediction tasks. However, annotated examples in structured domains are often costly to obtain, which limits the applicability of neural networks. In this work, we propose Maximum Margin Reward Networks, a neural network-based framework that learns from both explicit supervision signals (full structures) and implicit supervision signals (delayed feedback on the correctness of the predicted structure). On named entity recognition and semantic parsing, our model outperforms previous systems on the benchmark datasets CoNLL-2003 and WebQuestionsSP.
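To make the training idea concrete, below is a minimal Python sketch of a reward-scaled structured hinge loss over a list of candidate structures. The function name, the scalar candidate scores, and the toy reward values are illustrative assumptions, not the paper's exact formulation, which scores full structures with a neural network.

# Minimal sketch of a max-margin loss over candidate structures.
# NOTE: the scores, rewards, and margin scaling here are illustrative
# assumptions, not the exact objective defined in the paper.
from typing import List

def max_margin_reward_loss(scores: List[float], rewards: List[float]) -> float:
    """Hinge loss pushing the highest-reward candidate to outscore every
    other candidate by a margin equal to their reward gap.

    scores  -- model scores for each candidate structure
    rewards -- explicit supervision: overlap with the gold structure;
               implicit supervision: delayed 0/1 feedback on correctness
    """
    best = max(range(len(rewards)), key=lambda i: rewards[i])
    loss = 0.0
    for i in range(len(scores)):
        if i == best:
            continue
        margin = rewards[best] - rewards[i]  # reward-scaled margin
        loss = max(loss, margin - (scores[best] - scores[i]))
    return max(loss, 0.0)

# Explicit supervision: graded rewards from comparison to the gold structure.
print(max_margin_reward_loss([0.2, 0.9, 0.1], [1.0, 0.4, 0.0]))  # 1.3
# Implicit supervision: only delayed binary feedback is available.
print(max_margin_reward_loss([0.5, 0.7], [1.0, 0.0]))            # 1.2

The loss is positive exactly when some lower-reward candidate scores within the reward-scaled margin of the best candidate, so minimizing it drives the model to rank high-reward structures above low-reward ones under either form of supervision.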
Anthology ID:
D17-1252
Volume:
Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing
Month:
September
Year:
2017
Address:
Copenhagen, Denmark
Editors:
Martha Palmer, Rebecca Hwa, Sebastian Riedel
Venue:
EMNLP
SIG:
SIGDAT
Publisher:
Association for Computational Linguistics
Pages:
2368–2378
URL:
https://aclanthology.org/D17-1252
DOI:
10.18653/v1/D17-1252
Cite (ACL):
Haoruo Peng, Ming-Wei Chang, and Wen-tau Yih. 2017. Maximum Margin Reward Networks for Learning from Explicit and Implicit Supervision. In Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing, pages 2368–2378, Copenhagen, Denmark. Association for Computational Linguistics.
Cite (Informal):
Maximum Margin Reward Networks for Learning from Explicit and Implicit Supervision (Peng et al., EMNLP 2017)
PDF:
https://aclanthology.org/D17-1252.pdf
Video:
https://aclanthology.org/D17-1252.mp4
Data:
CoNLL 2003, WebQuestions