“Alexa in the wild” – Collecting Unconstrained Conversations with a Modern Voice Assistant in a Public Environment

Ingo Siegert


Abstract
Datasets featuring modern voice assistants such as Alexa, Siri, Cortana and others allow an easy study of human-machine interactions. But data collections offering unconstrained, unscripted public interaction are quite rare. Many studies so far have focused on private usage, short pre-defined tasks or specific domains. This contribution presents a dataset providing a large number of unconstrained public interactions with a voice assistant. Up to now, around 40 hours of device-directed utterances have been collected during a science exhibition touring through Germany. The data recording was part of an exhibit that invited visitors to interact with a commercial voice assistant system (Amazon’s ALEXA) but did not restrict them to a specific topic. A specifically developed quiz was the starting point of the conversation, as the voice assistant was presented to the visitors as a possible joker for the quiz. But the visitors were not forced to solve the quiz with the help of the voice assistant, and thus many visitors had an open conversation. The provided dataset – Voice Assistant Conversations in the wild (VACW) – includes the transcripts of both the visitors’ requests and Alexa’s answers, identified topics and sessions, as well as acoustic characteristics automatically extractable from the visitors’ audio files.
Anthology ID:
2020.lrec-1.77
Volume:
Proceedings of the Twelfth Language Resources and Evaluation Conference
Month:
May
Year:
2020
Address:
Marseille, France
Editors:
Nicoletta Calzolari, Frédéric Béchet, Philippe Blache, Khalid Choukri, Christopher Cieri, Thierry Declerck, Sara Goggi, Hitoshi Isahara, Bente Maegaard, Joseph Mariani, Hélène Mazo, Asuncion Moreno, Jan Odijk, Stelios Piperidis
Venue:
LREC
Publisher:
European Language Resources Association
Pages:
615–619
Language:
English
URL:
https://aclanthology.org/2020.lrec-1.77
Cite (ACL):
Ingo Siegert. 2020. “Alexa in the wild” – Collecting Unconstrained Conversations with a Modern Voice Assistant in a Public Environment. In Proceedings of the Twelfth Language Resources and Evaluation Conference, pages 615–619, Marseille, France. European Language Resources Association.
Cite (Informal):
“Alexa in the wild” – Collecting Unconstrained Conversations with a Modern Voice Assistant in a Public Environment (Siegert, LREC 2020)
PDF:
https://aclanthology.org/2020.lrec-1.77.pdf