E2E NLG Challenge

Event Notification Type: Call for Participation
Contact: Jekaterina Novikova, Ondrej Dusek, Verena Rieser, Heriot-Watt University, Edinburgh, UK
Submission Deadline: Tuesday, 31 October 2017

Shared task on end-to-end natural language generation from dialogue acts, using unaligned data.

MOTIVATION
Natural language generation plays a critical role in conversational agents, as it has a significant impact on a user’s impression of the system. This shared task focuses on recent end-to-end (E2E), data-driven NLG methods, which jointly learn sentence planning and surface realisation from non-aligned data (e.g. Wen et al., 2015; Mei et al., 2015; Dusek & Jurcicek, 2016; Lampouras & Vlachos, 2016).

So far, E2E NLG approaches have been limited to small, de-lexicalised data sets, e.g. BAGEL, SF Hotels/Restaurants, or RoboCup. In this shared challenge, we will provide a new crowd-sourced data set of 50k instances in the restaurant domain, as described in (Novikova, Lemon & Rieser, 2016). Each instance consists of a dialogue act-based meaning representation (MR) and up to 5 references in natural language. In contrast to previously used data, our data set includes additional challenges, such as open vocabulary, complex syntactic structures and diverse discourse phenomena. For example:

MR:
name[The Eagle],
eatType[coffee shop],
food[French],
priceRange[moderate],
customerRating[3/5],
area[riverside],
kidsFriendly[yes],
near[Burger King]

NL:
“The three star coffee shop, The Eagle, gives families a mid-priced dining experience featuring a variety of wines and cheeses. Find The Eagle near Burger King.”
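As a minimal sketch of how an MR in the format shown above could be read into a program, the snippet below parses the slot[value] pairs into a dictionary. It assumes the MR is given as a single comma-separated string; the helper name parse_mr is hypothetical and is not part of any released challenge tooling.

```python
import re
from typing import Dict

# Assumption: each attribute is written as slot[value], as in the example MR.
SLOT_PATTERN = re.compile(r"(\w+)\[([^\]]*)\]")


def parse_mr(mr: str) -> Dict[str, str]:
    """Parse a flat MR string into a slot -> value dictionary (hypothetical helper)."""
    return {slot: value.strip() for slot, value in SLOT_PATTERN.findall(mr)}


example_mr = (
    "name[The Eagle], eatType[coffee shop], food[French], "
    "priceRange[moderate], customerRating[3/5], area[riverside], "
    "kidsFriendly[yes], near[Burger King]"
)
parsed = parse_mr(example_mr)
print(parsed["name"], "|", parsed["near"])
```

A dictionary keeps the slots easy to look up, but it would drop repeated slots; the example MR does not contain any.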

The full data set will be released to participants according to the timeline below. A sample of the data can be obtained here.

This challenge follows on from previous successful shared tasks on generation, e.g. SemEval’17 task 9 on text generation from AMR, and Generation Challenges 2008-11. However, this is the first NLG task to concentrate on (1) generation from dialogue acts, (2) using semantically unaligned data.

PROPOSED TASK
The task is to generate an utterance from a given MR that is (a) similar to human-generated reference texts and (b) highly rated by humans. Similarity will be assessed using standard metrics, such as BLEU and METEOR. Human ratings will be obtained using a mixture of crowd-sourcing and expert annotations. We will also test a suite of novel metrics to estimate the quality of a generated utterance.
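To illustrate the kind of word-overlap similarity that metrics such as BLEU build on, here is a simplified sketch of clipped (modified) n-gram precision against multiple references. It is not the challenge's evaluation script and omits BLEU's brevity penalty and the geometric mean over n-gram orders; the function names and whitespace tokenisation are assumptions made for illustration.

```python
from collections import Counter
from typing import List


def ngrams(tokens: List[str], n: int) -> Counter:
    """Count all n-grams of order n in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def clipped_precision(candidate: List[str], references: List[List[str]], n: int) -> float:
    """Clipped n-gram precision: each candidate n-gram is credited at most
    as many times as it occurs in the single reference where it is most frequent."""
    cand_counts = ngrams(candidate, n)
    if not cand_counts:
        return 0.0
    max_ref = Counter()
    for ref in references:
        for gram, count in ngrams(ref, n).items():
            max_ref[gram] = max(max_ref[gram], count)
    clipped = sum(min(count, max_ref[gram]) for gram, count in cand_counts.items())
    return clipped / sum(cand_counts.values())


# Assumption: naive whitespace tokenisation, lower-cased text.
candidate = "the eagle is a coffee shop near burger king".split()
references = [
    "the eagle is a french coffee shop near burger king".split(),
    "find the eagle near burger king".split(),
]
print(clipped_precision(candidate, references, 1))
print(clipped_precision(candidate, references, 2))
```

Clipping prevents a system from scoring well by repeating a single frequent word, which is why an output such as "the the the" gets only partial credit against a reference containing "the" once.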

REGISTER INTEREST:
Please register here if you would like to receive a link to the full data set (50k instances): https://goo.gl/forms/DAj41Pl6IevZSTWl1

IMPORTANT DATES

  • 13 March 2017: Registration opens
  • 27 March 2017: Training and development data are released (MRs + references)
  • May 2017: Baseline is released
  • 16 October 2017: Test data is released (MRs only)
  • 31 October 2017: Entry submission deadline
  • 15 November 2017: Evaluation results are released
  • 15 December 2017: Participants submit a paper describing their systems
  • February 2018: Results presented at workshop

ORGANISING COMMITTEE
Jekaterina Novikova, Ondrej Dusek, Verena Rieser
Heriot-Watt University, Edinburgh, UK.

CONTACT DETAILS
e2e-nlg-challenge [at] googlegroups.com

ADVISORY COMMITTEE
Amanda Stent, Bloomberg
Andreas Vlachos, University of Sheffield
Marilyn Walker, University of California Santa Cruz
Matthew Walter, Toyota Technological Institute at Chicago
Shawn Wen, University of Cambridge
Luke Zettlemoyer, University of Washington