Not All Dialogues are Created Equal: Instance Weighting for Neural Conversational Models

Pierre Lison, Serge Bibauw


Abstract
Neural conversational models require substantial amounts of dialogue data to estimate their parameters and are therefore usually learned on large corpora such as chat forums or movie subtitles. These corpora are, however, often challenging to work with, notably due to their frequent lack of turn segmentation and the presence of multiple references external to the dialogue itself. This paper shows that these challenges can be mitigated by adding a weighting model into the architecture. The weighting model, which is itself estimated from dialogue data, associates each training example with a numerical weight that reflects its intrinsic quality for dialogue modelling. At training time, these sample weights are incorporated into the empirical loss to be minimised. Evaluation results on retrieval-based models trained on movie and TV subtitles demonstrate that the inclusion of such a weighting model improves model performance on unsupervised metrics.
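The core idea of the abstract, folding per-example quality weights into the empirical loss, can be sketched as follows. This is an illustrative reimplementation, not the authors' code; the function name, the NumPy formulation, and the toy batch values are all assumptions for demonstration.

```python
import numpy as np

def weighted_nll_loss(log_probs, labels, weights):
    """Weighted negative log-likelihood over a mini-batch.

    log_probs: (batch, n_classes) log-probabilities over candidate responses
    labels:    (batch,) integer indices of the true response
    weights:   (batch,) per-example quality weights from a weighting model
    """
    # Standard per-example loss: -log p(true response)
    per_example = -log_probs[np.arange(len(labels)), labels]
    # Each example's contribution is scaled by its quality weight,
    # so low-quality dialogues influence the gradient less.
    return np.sum(weights * per_example) / np.sum(weights)

# Toy batch (hypothetical numbers): two examples, three candidates each.
log_probs = np.log(np.array([[0.7, 0.2, 0.1],
                             [0.1, 0.3, 0.6]]))
labels = np.array([0, 2])
weights = np.array([1.0, 0.4])   # second example judged lower quality
loss = weighted_nll_loss(log_probs, labels, weights)
```

Setting all weights to 1 recovers the ordinary mean loss, so the weighting model acts as a soft filter rather than a hard exclusion of noisy training pairs.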
Anthology ID:
W17-5546
Volume:
Proceedings of the 18th Annual SIGdial Meeting on Discourse and Dialogue
Month:
August
Year:
2017
Address:
Saarbrücken, Germany
Editors:
Kristiina Jokinen, Manfred Stede, David DeVault, Annie Louis
Venue:
SIGDIAL
SIG:
SIGDIAL
Publisher:
Association for Computational Linguistics
Pages:
384–394
URL:
https://aclanthology.org/W17-5546
DOI:
10.18653/v1/W17-5546
Cite (ACL):
Pierre Lison and Serge Bibauw. 2017. Not All Dialogues are Created Equal: Instance Weighting for Neural Conversational Models. In Proceedings of the 18th Annual SIGdial Meeting on Discourse and Dialogue, pages 384–394, Saarbrücken, Germany. Association for Computational Linguistics.
Cite (Informal):
Not All Dialogues are Created Equal: Instance Weighting for Neural Conversational Models (Lison & Bibauw, SIGDIAL 2017)
PDF:
https://aclanthology.org/W17-5546.pdf
Data
OpenSubtitles