Could Speaker, Gender or Age Awareness be beneficial in Speech-based Emotion Recognition?

Maxim Sidorov, Alexander Schmitt, Eugene Semenkin, Wolfgang Minker


Abstract
Emotion Recognition (ER) is an important part of dialogue analysis that can be used to improve the quality of Spoken Dialogue Systems (SDSs). A hypothesis about the emotion expressed in an end-user's current response can be used by the dialogue manager component to change the SDS strategy, which could result in a quality enhancement. In this study, additional speaker-related information is used to improve the performance of the speech-based ER process. The information analysed is the identity, gender and age of the speaker. Two schemes are described: using the additional information as an independent variable within the feature vector, and creating separate emotional models for each speaker, gender or age cluster. The performance of the proposed approaches was compared against a baseline ER system, in which no additional information is used, on a number of emotional speech corpora in German, English, Japanese and Russian. The study revealed that for some of the corpora the proposed approach significantly outperforms the baseline methods, with a relative difference of up to 11.9%.
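The two schemes named in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the nearest-centroid classifier, the single "pitch" feature, the toy values, and all function names (`train_centroids`, `augment`, `train_per_group`, and so on) are assumptions chosen only to make the idea concrete.

```python
from collections import defaultdict

def train_centroids(X, y):
    """Toy nearest-centroid classifier: one mean feature vector per label."""
    sums, counts = {}, defaultdict(int)
    for x, label in zip(X, y):
        prev = sums.get(label, [0.0] * len(x))
        sums[label] = [s + v for s, v in zip(prev, x)]
        counts[label] += 1
    return {lab: [s / counts[lab] for s in vec] for lab, vec in sums.items()}

def predict(centroids, x):
    """Return the label whose centroid is closest (squared Euclidean)."""
    def dist(c):
        return sum((a - b) ** 2 for a, b in zip(x, c))
    return min(centroids, key=lambda lab: dist(centroids[lab]))

def augment(x, gender):
    """Scheme 1: append the speaker attribute to the feature vector."""
    return list(x) + [1.0 if gender == "f" else 0.0]

def train_per_group(X, y, groups):
    """Scheme 2: train a separate emotion model for each group."""
    buckets = defaultdict(lambda: ([], []))
    for x, label, g in zip(X, y, groups):
        buckets[g][0].append(x)
        buckets[g][1].append(label)
    return {g: train_centroids(Xg, yg) for g, (Xg, yg) in buckets.items()}

def predict_per_group(models, x, group):
    """Route the sample to the model of its (known) group."""
    return predict(models[group], x)

# Hypothetical data: one pitch-like feature, shifted between genders.
X = [[300.0], [220.0], [220.0], [140.0]]
y = ["angry", "neutral", "angry", "neutral"]
gender = ["f", "f", "m", "m"]

baseline = train_centroids(X, y)              # no speaker information
models = train_per_group(X, y, gender)        # scheme 2
augmented = train_centroids(                  # scheme 1
    [augment(x, g) for x, g in zip(X, gender)], y
)

sample, sample_gender = [225.0], "f"          # a neutral female utterance
print(predict(baseline, sample))                          # baseline guess
print(predict_per_group(models, sample, sample_gender))   # gender-specific guess
print(predict(augmented, augment(sample, sample_gender))) # augmented-vector guess
```

In this toy setting the pooled baseline confuses the neutral female sample with anger because the gender-related pitch shift overlaps the emotion-related one, while the gender-specific model does not. The augmented vector does not help this particular linear classifier; whether scheme 1 pays off depends on the classifier used, which the abstract does not specify.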
Anthology ID:
L16-1010
Volume:
Proceedings of the Tenth International Conference on Language Resources and Evaluation (LREC'16)
Month:
May
Year:
2016
Address:
Portorož, Slovenia
Editors:
Nicoletta Calzolari, Khalid Choukri, Thierry Declerck, Sara Goggi, Marko Grobelnik, Bente Maegaard, Joseph Mariani, Helene Mazo, Asuncion Moreno, Jan Odijk, Stelios Piperidis
Venue:
LREC
Publisher:
European Language Resources Association (ELRA)
Pages:
61–68
URL:
https://aclanthology.org/L16-1010
Cite (ACL):
Maxim Sidorov, Alexander Schmitt, Eugene Semenkin, and Wolfgang Minker. 2016. Could Speaker, Gender or Age Awareness be beneficial in Speech-based Emotion Recognition? In Proceedings of the Tenth International Conference on Language Resources and Evaluation (LREC'16), pages 61–68, Portorož, Slovenia. European Language Resources Association (ELRA).
Cite (Informal):
Could Speaker, Gender or Age Awareness be beneficial in Speech-based Emotion Recognition? (Sidorov et al., LREC 2016)
PDF:
https://aclanthology.org/L16-1010.pdf