DSTC10 Track 2 - Knowledge-grounded Task-oriented Dialogue Modeling on Spoken Conversations

This repository contains the data, scripts, and baseline code for DSTC10 Track 2.

This challenge track benchmarks the robustness of conversational models against the gaps between written and spoken conversations. Specifically, it targets two tasks: 1) multi-domain dialogue state tracking and 2) task-oriented conversational modeling with unstructured knowledge access. For both tasks, participants develop models using any existing public data and submit their model outputs on the unlabeled test set, which is built from ASR outputs.
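To make the submission workflow concrete, below is a minimal sketch of loading dialogue logs and pulling out the most recent user turn as model input. The file layout and field names (`logs.json`, `speaker`, `text`) are assumptions modeled on earlier DSTC knowledge-grounded tracks; check this repository's data description for the actual schema.

```python
import json

# NOTE: file name and field names ("speaker", "text", speaker code "U")
# are hypothetical; consult the repository's data documentation.
def load_dialogues(logs_path):
    """Load dialogue contexts: a list of dialogues, each a list of turns."""
    with open(logs_path) as f:
        return json.load(f)

def latest_user_turn(dialogue):
    """Return the text of the most recent user turn, or None if absent."""
    for turn in reversed(dialogue):
        if turn.get("speaker") == "U":
            return turn.get("text")
    return None
```

A model would typically consume the full turn list as context; taking only the latest user turn here just illustrates iterating over the assumed structure.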

Organizers: Seokhwan Kim, Yang Liu, Di Jin, Alexandros Papangelis, Behnam Hedayatnia, Karthik Gopalakrishnan, Dilek Hakkani-Tur


If you want to publish experimental results with this dataset or use the baseline models, please cite this article:

@misc{kim2021how,
      title={"How robust r u?": Evaluating Task-Oriented Dialogue Systems on Spoken Conversations}, 
      author={Seokhwan Kim and Yang Liu and Di Jin and Alexandros Papangelis and Karthik Gopalakrishnan and Behnam Hedayatnia and Dilek Hakkani-Tur},
      year={2021},
      eprint={2109.13489},
      archivePrefix={arXiv},
      primaryClass={cs.CL}
}

Timeline

  • Validation data released: Jun 14, 2021
  • Test data released: Sep 13, 2021
  • Entry submission deadline: Sep 21, 2021
  • Objective evaluation completed: Sep 28, 2021
  • Human evaluation completed: Oct 8, 2021

Rules

  • Participation is welcome from any team (academic, corporate, non-profit, or government).
  • Each team can participate in either or both sub-tracks by submitting up to 5 entries for each track.
  • The identities of participants will NOT be published or made public. In written results, teams will be identified by team IDs (e.g., team1, team2). The organizers will verbally announce the identities of all teams at the workshop chosen for communicating results.
  • Participants may reveal their own team label (e.g., team5) in publications or presentations if they wish, but may not reveal the identities of other teams.
  • Participants are allowed to use any external datasets, resources or pre-trained models which are publicly available.
  • Participants are NOT allowed to do any manual examination or modification of the test data.
  • All submitted system outputs, together with the evaluation results, will be released to the public after the evaluation period.

Contact

Join the DSTC mailing list to get the latest updates about DSTC10.

For specific inquiries about DSTC10 Track 2, please feel free to contact: seokhwk (at) amazon (dot) com