We are pleased to announce the opening of the third Dialog State Tracking Challenge (DSTC3). Complete information, including the challenge handbook, training data, evaluation scripts, and baseline trackers, is available on the DSTC3 website:
The Dialog State Tracking Challenge (DSTC) is a research challenge focused on improving the state of the art in tracking the state of spoken dialog systems. State tracking refers to accurately estimating the user's goal as a dialog progresses. Accurate state tracking is desirable because it provides robustness to errors in speech recognition, and helps reduce ambiguity inherent in language within a temporal process like dialog.
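To make the idea of state tracking concrete, here is a minimal sketch in Python of a toy tracker for a single slot. It is not the DSTC baseline and does not use the DSTC data format: the input layout (one list of (value, confidence) pairs per turn) and the function name track_slot are assumptions made purely for illustration. The tracker treats each confidence as independent evidence that the user mentioned a value, and accumulates that evidence over turns.

```python
from typing import Dict, List, Tuple

# Hypothetical input format: for each turn, a list of (value, confidence)
# hypotheses for one slot, e.g. as produced by a speech understanding component.
Turn = List[Tuple[str, float]]


def track_slot(turns: List[Turn]) -> Dict[str, float]:
    """Toy tracker: accumulate evidence for each value across turns."""
    belief: Dict[str, float] = {}
    for hypotheses in turns:
        for value, conf in hypotheses:
            # Probability the value was mentioned at least once so far,
            # treating turns as independent: 1 - prod(1 - conf).
            prev = belief.get(value, 0.0)
            belief[value] = 1.0 - (1.0 - prev) * (1.0 - conf)
    # Rescale only if the accumulated mass exceeds 1, so the result can be
    # read as a distribution with leftover mass meaning "no value yet".
    total = sum(belief.values())
    if total > 1.0:
        belief = {value: p / total for value, p in belief.items()}
    return belief


if __name__ == "__main__":
    dialog = [
        [("italian", 0.6), ("indian", 0.3)],  # turn 1: noisy recognition
        [("italian", 0.5)],                   # turn 2: user repeats the goal
    ]
    print(track_slot(dialog))
```

In this sketch, a value that is recognized with moderate confidence in several turns ends up ranked above a value heard only once, which is the kind of robustness to recognition errors described above.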
In this challenge, participants are given labelled corpora of dialogs to develop state tracking algorithms. The trackers will then be evaluated on a common set of held-out dialogs, which are released unlabelled during a one-week period. This is a corpus-based challenge: participants do not need to implement a speech recognizer, a semantic parser, or an end-to-end dialog system.
The first DSTC was completed in 2013 with 9 participating teams and 27 entries in total; 9 papers were presented at SIGDIAL 2013, advancing the state of the art in several dimensions. DSTC2 introduced a completely new dataset in a new domain (restaurant information), with more complicated and dynamic dialog states that may change throughout the dialog. DSTC2 concluded a few months ago, again with 9 participating teams (about half of them new). Results have been submitted to, and will be presented at, a special session at SIGDIAL 2014.
DSTC3 will focus on the task of adapting and expanding to a new domain when a large amount of labelled data is available in a smaller domain. The "smaller domain" is the restaurants domain from DSTC2; the "new extended domain" is a larger tourist information domain. DSTC3 includes restaurants, adds pubs and coffee shops, and provides more detail (slots) for restaurants relative to the DSTC2 data.
Participants are encouraged to submit papers describing their work to SLT 2014, whose deadline will be approx. 20 July. The organisers are awaiting confirmation of a proposed special session at the conference.
- 4 April 2014 : Labelled tourist information seed set released
- 9 June 2014 : Unlabelled tourist information test set released
- 16 June 2014 : Tracker output on tourist information test set due
- 23 June 2014 : Results on tourist information test set given to participants
- 20 July 2014 : SLT paper deadline (approximate)
- 7-10 Dec 2014 : SLT workshop (Lake Tahoe, Nevada, USA)
The training data, scoring scripts, baselines, domain ontology and database are all available for public download. Prospective participants are strongly encouraged to join the mailing list, to ensure that they receive notifications of updates to data or scripts and are included in discussions about the challenge. To join, email firstname.lastname@example.org with 'subscribe DSTC' in the body of the message (without quotes).
Feel free to direct questions to the organizers. We hope you will consider participating!
Matt Henderson (lead) - Cambridge University [email@example.com]
Blaise Thomson - Cambridge University [firstname.lastname@example.org]
Jason D. Williams - Microsoft Research [email@example.com]
DSTC3 advisory board
Bill Byrne - University of Cambridge
Paul Crook - Microsoft Research
Maxine Eskenazi - Carnegie Mellon University
Milica Gasic - University of Cambridge
Helen Hastie - Heriot-Watt University
Kee-Eung Kim - KAIST
Sungjin Lee - Carnegie Mellon University
Oliver Lemon - Heriot-Watt University
Olivier Pietquin - SUPELEC
Joelle Pineau - McGill University
Deepak Ramachandran - Nuance Communications
Brian Strope - Google
Steve Young - University of Cambridge