Similarity Scoring for Dialogue Behaviour Comparison

Stefan Ultes, Wolfgang Maier


Abstract
The differences in decision making between behavioural models of voice interfaces are hard to capture using existing measures for the absolute performance of such models. For instance, two models may have a similar task success rate, but very different ways of getting there. In this paper, we propose a general methodology to compute the similarity of two dialogue behaviour models and investigate different ways of computing scores on both the semantic and the textual level. Complementing absolute measures of performance, we test our scores on three different tasks and show the practical usability of the measures.
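The paper's concrete scoring functions are not reproduced on this page; as a minimal illustrative sketch only, the following Python snippet shows one plausible way such a turn-by-turn comparison could look, assuming each behaviour model's output for the same dialogue contexts is logged as a (dialogue act, utterance) pair. All function names and example data here are hypothetical and are not taken from the paper.

```python
# Illustrative sketch only: compares two behaviour models on turn-aligned
# logs, once on the semantic level (dialogue acts) and once on the textual
# level (surface form of the utterances). Not the authors' actual metric.

from difflib import SequenceMatcher


def semantic_similarity(acts_a, acts_b):
    """Fraction of turns in which both models chose the same dialogue act."""
    assert len(acts_a) == len(acts_b) and acts_a
    matches = sum(a == b for a, b in zip(acts_a, acts_b))
    return matches / len(acts_a)


def textual_similarity(utts_a, utts_b):
    """Mean character-level surface similarity of the generated utterances."""
    assert len(utts_a) == len(utts_b) and utts_a
    ratios = [SequenceMatcher(None, a, b).ratio() for a, b in zip(utts_a, utts_b)]
    return sum(ratios) / len(ratios)


if __name__ == "__main__":
    # Hypothetical turn-aligned outputs of two behaviour models A and B.
    acts_a = ["request(area)", "inform(name)", "bye()"]
    acts_b = ["request(food)", "inform(name)", "bye()"]
    utts_a = ["Which area are you looking for?", "Try the Golden House.", "Goodbye!"]
    utts_b = ["What kind of food would you like?", "Try the Golden House.", "Goodbye!"]

    print(f"semantic-level similarity: {semantic_similarity(acts_a, acts_b):.2f}")
    print(f"textual-level similarity:  {textual_similarity(utts_a, utts_b):.2f}")
```

Two models with identical task success rates can still diverge sharply on both scores, which is exactly the kind of behavioural difference the paper argues absolute performance measures fail to capture.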
Anthology ID:
2020.sigdial-1.38
Volume:
Proceedings of the 21th Annual Meeting of the Special Interest Group on Discourse and Dialogue
Month:
July
Year:
2020
Address:
1st virtual meeting
Editors:
Olivier Pietquin, Smaranda Muresan, Vivian Chen, Casey Kennington, David Vandyke, Nina Dethlefs, Koji Inoue, Erik Ekstedt, Stefan Ultes
Venue:
SIGDIAL
SIG:
SIGDIAL
Publisher:
Association for Computational Linguistics
Pages:
311–322
URL:
https://aclanthology.org/2020.sigdial-1.38
DOI:
10.18653/v1/2020.sigdial-1.38
Cite (ACL):
Stefan Ultes and Wolfgang Maier. 2020. Similarity Scoring for Dialogue Behaviour Comparison. In Proceedings of the 21th Annual Meeting of the Special Interest Group on Discourse and Dialogue, pages 311–322, 1st virtual meeting. Association for Computational Linguistics.
Cite (Informal):
Similarity Scoring for Dialogue Behaviour Comparison (Ultes & Maier, SIGDIAL 2020)
PDF:
https://aclanthology.org/2020.sigdial-1.38.pdf
Video:
https://youtube.com/watch?v=zs0yOpHWBf8